On the Multiple Access Channel with Asymmetric Noisy
State Information at the Encoders

Nevroz Şen, Fady Alajaji, Serdar Yüksel and Giacomo Como

Abstract 

We consider the problem of reliable communication over multiple-access channels (MAC) where the 
channel is driven by an independent and identically distributed state process and the encoders and the 
decoder are provided with various degrees of asymmetric noisy channel state information (CSI). For the 
case where the encoders observe causal, asymmetric noisy CSI and the decoder observes complete CSI, 
we provide inner and outer bounds to the capacity region, which are tight for the sum-rate capacity. We 
then observe that, under a Markov assumption, similar capacity results also hold in the case where the 
receiver observes noisy CSI. Furthermore, we provide a single letter characterization for the capacity 
region when the CSI at the encoders are asymmetric deterministic functions of the CSI at the decoder 
and the encoders have non-causal noisy CSI (its causal version was recently solved in [1]). When the
encoders observe asymmetric noisy CSI with asymmetric delays and the decoder observes complete CSI, 
we provide a single letter characterization for the capacity region. Finally, we consider a cooperative 
scenario with common and private messages, with asymmetric noisy CSI at the encoders and complete 
CSI at the decoder. We provide a single letter expression for the capacity region for such channels. 
For the cooperative scenario, we also note that as soon as the common message encoder does not have 
access to CSI, then in any noisy setup, covering the cases of no CSI or noisy CSI at the decoder,
it is possible to obtain a single letter characterization for the capacity region. The main component in 
these results is a generalization of a converse coding approach, recently introduced in [1] for the MAC 
with asymmetric quantized CSI at the encoders and herein considerably extended and adapted for the 
noisy CSI setup. 

This work was supported in part by the Natural Sciences and Engineering Research Council of Canada (NSERC). 

The material in this paper was presented in part at the Forty-Ninth Annual Allerton Conference on Communication, Control, and
Computing, Monticello, IL, September 2011. 

N. Şen, F. Alajaji and S. Yüksel are with the Department of Mathematics and Statistics, Queen's University, Kingston, ON
K7L 3N6, Canada (email:{nsen,fady,yuksel}@mast.queensu.ca). 

G. Como is with the Department of Automatic Control, Lund University, SE-221 00 Lund, Sweden 
(email: giacomo@control.lth.se).



January 20, 2012 



DRAFT 






Index Terms 
Cooperative multiple-access channel, Converse coding theorem.

I. Introduction and Literature Review 

Modeling communication channels with a state process, which governs the channel behavior, fits 
well for many physical scenarios. For single-user channels, the characterization of the capacity with 
various degrees of channel state information at the transmitter (CSIT) and at the receiver (CSIR) is well 
understood. Among them, Shannon [2] provides the capacity formula for a discrete memoryless channel 
with causal noiseless CSIT, where the state process is independent and identically distributed (i.i.d.), in 
terms of Shannon strategies (random functions from the state space to the channel input space). In [3] 
Gel'fand and Pinsker consider the same problem with non-causal side information and establish a single- 
letter capacity formula. In [4], noisy state observation available at both the transmitter and the receiver 
is considered and the capacity under such a setting is derived. Later, in [5] this result is shown to be a 
special case of Shannon's model and the authors also prove that when CSIT is a deterministic function 
of CSIR optimal codes can be constructed directly on the input alphabet. In [6], the authors examine 
the discrete modulo-additive noise channel with causal CSIT which governs the noise distribution, and
they determine the optimal strategies that achieve channel capacity. In [7] fading channels with perfect
channel state information at the transmitter are considered and it is shown that with instantaneous and
perfect CSI, the transmitter can adjust the data rates for each channel state to maximize the average
transmission rate. In [8], a single letter characterization of the capacity region for single-user finite-state
Markovian channels with quantized state information available at the transmitter and full state information
at the decoder is provided. In a closely related direction, finite-state channels (with memory) with output
feedback are investigated in [9]. In particular, [9] shows that it is possible to formulate the computation
of feedback capacity as a stochastic control problem. In [10], finite-state channels with feedback, where
feedback is a time-invariant deterministic function of the output samples, are considered.

The literature on finite state multiple access channels (FS-MAC) with different assumptions of CSIR 
and CSIT (such as causal vs non-causal, perfect vs imperfect) is extensive and the main contributions of 
the current paper have several interactions with the available results in the literature, which we present in 
Subsection I-A. Hence, we believe that in order to suitably highlight the contributions of this paper, it is
worth discussing the relevant literature for the multi-user setting in more detail. To start, [11] provides a
multi-letter characterization of the capacity region of time-varying MACs with general channel statistics
(with/without memory) under a general state process (not necessarily stationary or ergodic) and with 
various degrees of CSIT and CSIR. In [11], it is also shown that when the channel is memoryless, if the 
encoders use only the past k asymmetric partial (but not noisy) CSI and the decoder has complete CSI, 
then it is possible to simplify the multi-letter characterization to a single letter one [11, Theorem 4]. In 
[12], a general framework for the capacity region of MACs with causal and non-causal CSI is presented. 
In particular, an achievable rate region is presented for the memoryless FS-MAC with correlated CSI 
and the sum-rate capacity is established under the condition that the state information available to each 
encoder are independent. In [13], MACs with complete CSIR and noncausal, partial, rate limited CSITs 
are considered. In particular, for the degraded case, i.e., the case where the CSI available at one of the 
encoders is a subset of the CSI available at the other encoder, a single letter formula for the capacity 
region is provided and when the CSITs are not degraded, inner and outer bounds are derived, see [13, 
Theorems 1, 2]. In [14], memoryless FS-MACs with two independent states, each known causally and
strictly causally to one encoder, are considered and an achievable rate region, which is shown to contain
an achievable region where each user applies Shannon strategies, is proposed. In [15], another achievable
rate region for the same problem is proposed and in [16] it is shown that this region can be strictly larger
than the one proposed in [14]. In [14] it is also shown that strictly causal CSI does not increase the
sum-rate capacity. In [17] the finite-state Markovian MAC with asymmetric delayed CSITs is studied and its
capacity region is determined. Another active research direction on the FS-MAC regards the so-called 
cooperative FS-MAC where there exists a degraded condition on the message sets. In particular, [18] and 
[19] characterize the capacity region of the cooperative FS-MAC with states non-causally and causally 
available at the transmitters. For more recent results on the cooperative FS-MAC problem see references 
[20] and [21]. Finally, for a comprehensive survey on channel coding with side information see [22]. 

The most relevant work to this paper is [1], which presents a single letter characterization of the 
capacity region for memoryless FS-MAC in which transmitters observe asymmetric partial quantized 
CSI causally, and the receiver has full CSI. In the converse part of this work, which we discuss in more 
detail below, the authors use team decision theoretic methods [23] (see also [24], [25] and [26] for recent 
team decision and control theoretic approaches). When a comparison of this result with the previously 
mentioned results is made, we observe the following: i) it shows that when the state process is i.i.d. there 
is no loss of optimality if the encoders use a window size of k = 1 in [11, Theorem 3], ii) it extends 
the causal part of result [12, Theorem 5] to the case where CSITs are not independent, and finally, iii)
it partially answers the setup in [13, Theorem 2] with the assumption that CSITs are causal.

A. Main Contributions and Connections with the Literature

We consider several scenarios where the encoders and the decoder observe various degrees of noisy 
CSI. The essential requirement we impose is that the noisy CSI available to the decision makers is realized
via the corruption of CSI by different noise processes, which gives a realistic physical structure to the
communication setup. We herein note that the asymmetric noisy CSI assumption is acceptable as typically
the feedback links are imperfect and sufficiently far from each other so that the information carried through 
them is corrupted by different (independent) noise processes. Finally, what makes (asymmetric) noisy 
setups particularly interesting are the facts that 

(a) No transmitter CSI contains the CSI available to the other one; 

(b) CSI available to the decoder does not contain any of the CSI available to the two encoders.

When existing results, which provide a single letter capacity formulation, are examined, it can be observed
that most of them do not satisfy (a) or (b) or both (e.g., [1], [11], [12], [13], [17]). Nonetheless, among
these, [11] discusses the situation with noisy CSI and the authors make the observation that the situation
where the CSI at the encoders and decoder are noisy versions of $S_t$ can be accommodated by their
models. However, they also note that if the noises corrupting transmitters and receiver CSI are different,
then the encoder CSI will, in general, not be contained in the decoder CSI. Hence, motivated by similar
observations in the literature (e.g., [12]), we partially treat the scenarios below and provide inner and
outer bounds, which are tight for the sum-rate capacity, for the scenarios (1) and (1a) and provide a
single letter characterization for the capacity region of the latter scenarios:

(1) The memoryless FS-MAC in which each of the transmitters has an asymmetric causal noisy CSI 
and the receiver has complete CSI (Theorems 2.1, 2.2 and Corollary 2.1). 

(1a) The memoryless FS-MAC in which each of the transmitters has an asymmetric causal noisy
CSI and the receiver has also noisy CSI (Corollaries 2.2, 2.3 and 2.4).

(1b) The memoryless FS-MAC in which each of the transmitters has an asymmetric causal and
non-causal noisy CSIT which is a deterministic function of the noisy CSIR at the receiver (Theorem
2.3).

(2) The memoryless FS-MAC in which each of the transmitters has an asymmetrically delayed and 
asymmetric noisy CSI and the receiver has complete CSI (Theorem 3.1). 

(3) The cooperative memoryless FS-MAC in which both transmitters transmit a common message and 
one transmitter (informed transmitter) transmits a private message. The informed transmitter has 
causal noisy CSI, the other encoder has a delayed noisy CSI and the receiver has various degrees 
of CSI (Theorems 4.1 and 4.2).

Let us now briefly position these contributions with respect to the available results in the literature.
The sum-rate capacity determined in (1) and (1a) can be thought of as an extension of [12, Theorem 4]
to the case where the encoders have correlated CSI. The causal setup of (1b), with the observation of
the existence of an equivalent channel, is solved in [1]. The solution that we provide to the non-causal
case partially solves [13] and extends [12, Theorem 5] to the case where the encoders have correlated
CSI. Furthermore, since the causal and non-causal capacities are identical for scenario (1b), the causal
solution can be considered as an extension of [5, Proposition 1] to a noisy multi-user case. Finally, (3)
is an extension of [18, Theorem 4] to a noisy setup.

B. The Converse Coding Approach 

In this work, we adopt and expand on the converse technique presented in [1] and use it in a noisy 
setup. The converse coding approach of [1] is based on using memoryless stationary team policies which 
play a key role in showing that the past information is irrelevant. This is obtained by showing that under 
any policy that one can achieve using an arbitrary decentralized coding policy, the same performance 
can be achieved by using memoryless stationary team policies. More specifically, this is accomplished 
in two steps. In the first step, it is shown that any achievable rate pair can be approximated with the 
convex combinations of conditional mutual information terms which are indexed by the past CSIR. In 
the second step, the conditional probability distribution, of which these conditional mutual information
terms are a function, is examined. With the observation that the past CSIR only affects the "controls,"
i.e., memoryless stationary team policies, taking the convex hull associated to all possible such controls 
completes the converse part. However, as the authors mention in [1, Remark 2], for the validity of the 
above arguments, it would suffice that the state information available at the decoder contains the one 
available at the two transmitters. In this way, the decoder does not need to estimate the coding policies 
used in decentralized time-sharing. 

For the noisy setup, we need to modify this approach to account for the fact that the decoder does not 
have access to the state information at the encoders, and that the past state information does not lead to 
a tractable recursion. This difficulty is overcome by showing that a product form on the team policies 
exists in the noisy setup as well. 

The rest of the paper is organized as follows. In Sections II, III and IV, we formally state scenarios
(1)-(1b), (2) and (3), respectively, and present the main results and several observations. In Section V,
we provide two examples in one of which (the modulo-additive FS-MAC) we apply the result of [6] and
get the full capacity region by only considering the tightness of the sum-rate capacity. Finally, in Section
VI, we present concluding remarks.

Throughout the paper, we will use the following notations. A random variable will be denoted by an
upper case letter $X$ and its particular realization by a lower case letter $x$. For a vector $v$, and a positive
integer $i$, $v_i$ will denote the $i$-th entry of $v$, while $v_{[i]} = (v_1, \dots, v_i)$ will denote the vector of the first $i$
entries and $v_{[i,j]} = (v_i, \dots, v_j)$, $i \le j$, will denote the vector of entries between $i, j$ of $v$. For a finite set
$\mathcal{A}$, $\mathcal{P}(\mathcal{A})$ will denote the simplex of probability distributions over $\mathcal{A}$. Probability distributions are denoted
by $P(\cdot)$ and subscripted by the name of the random variables and conditioning, e.g., $P_{U,T|V,S}(u,t|v,s)$
is the conditional probability of $(U = u, T = t)$ given $(V = v, S = s)$. Finally, for a positive integer
$n$, we shall denote by $\mathcal{A}^{(n)} := \bigcup_{0 \le s < n} \mathcal{A}^s$ the set of $\mathcal{A}$-strings of length smaller than $n$. We denote the
indicator function of an event $E$ by $1_{\{E\}}$. All sets considered hereafter are finite.

II. Asymmetric Causal Noisy CSIT and Complete CSIR 

Consider a two-user memoryless FS-MAC, with two encoders, $a$, $b$, and two independent message
sources $W_a$ and $W_b$ which are uniformly distributed in the finite sets $\mathcal{W}_a$ and $\mathcal{W}_b$, respectively. The
channel inputs from the encoders are $X^a \in \mathcal{X}_a$ and $X^b \in \mathcal{X}_b$, respectively, and the channel output is
$Y \in \mathcal{Y}$. The channel state process is modeled as a sequence $\{S_t\}_{t=1}^{\infty}$ of random variables in some finite
space $\mathcal{S}$. The two encoders have access to a causal noisy version of the state information $S_t$ at each time
$t \ge 1$, modeled by $S_t^a \in \mathcal{S}_a$, $S_t^b \in \mathcal{S}_b$, respectively, where the joint distribution of $(S_t, S_t^a, S_t^b)$ factorizes
as

$$P_{S_t, S_t^a, S_t^b}(s_t, s_t^a, s_t^b) = P_{S_t^a|S_t}(s_t^a|s_t)\, P_{S_t^b|S_t}(s_t^b|s_t)\, P_{S_t}(s_t). \tag{1}$$

We also assume that $S_t$ is fully available at the receiver (see Fig. 1) and that $\{(S_t, S_t^a, S_t^b)\}_{t=1}^{\infty}$ is a
sequence of independent and identically distributed triples, independent from $(W_a, W_b)$. Therefore, we
have that for any $n \ge 1$,

$$P_{S_{[n]}, S^a_{[n]}, S^b_{[n]}, W_a, W_b}(s_{[n]}, s^a_{[n]}, s^b_{[n]}, w_a, w_b) = \frac{1}{|\mathcal{W}_a|} \frac{1}{|\mathcal{W}_b|} \prod_{t=1}^{n} P_{S_t^a|S_t}(s_t^a|s_t)\, P_{S_t^b|S_t}(s_t^b|s_t)\, P_{S_t}(s_t). \tag{2}$$

The channel inputs at time $t$, i.e., $X_t^a$ and $X_t^b$, are functions of the locally available information $(W_a, S^a_{[t]})$
and $(W_b, S^b_{[t]})$, respectively. Let $W := (W_a, W_b)$ and $X_t := (X_t^a, X_t^b)$. Then, the laws
governing $n$-sequences of state, input and output letters are given by

$$P_{Y_{[n]}|W, X_{[n]}, S_{[n]}, S^a_{[n]}, S^b_{[n]}}(y_{[n]}|w, x_{[n]}, s_{[n]}, s^a_{[n]}, s^b_{[n]}) = \prod_{t=1}^{n} P_{Y_t|X_t^a, X_t^b, S_t}(y_t|x_t^a, x_t^b, s_t), \tag{3}$$

Fig. 1. The multiple-access channel with asymmetric causal noisy state feedback.

where the channel's transition probability distribution, $P_{Y_t|X_t^a, X_t^b, S_t}(y_t|x_t^a, x_t^b, s_t)$, is given a priori.
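
As an illustration of the state model, the following is a minimal sketch of how i.i.d. triples $(S_t, S_t^a, S_t^b)$ obeying the factorization (1) could be simulated. The function name `sample_state_triples` and the dictionary-based encoding of the distributions are assumptions made for this example only; they do not appear in the paper.

```python
import random


def sample_state_triples(n, p_s, p_a_given_s, p_b_given_s, seed=0):
    """Draw n i.i.d. triples (s_t, s_t^a, s_t^b) following factorization (1).

    p_s:          dict state -> probability
    p_a_given_s:  dict state -> (dict observation -> probability) for encoder a
    p_b_given_s:  dict state -> (dict observation -> probability) for encoder b
    (Hypothetical data layout chosen for this sketch.)
    """
    rng = random.Random(seed)

    def draw(pmf):
        outcomes = list(pmf)
        weights = [pmf[o] for o in outcomes]
        return rng.choices(outcomes, weights=weights)[0]

    triples = []
    for _ in range(n):
        s = draw(p_s)                 # true channel state
        s_a = draw(p_a_given_s[s])    # noisy observation at encoder a
        s_b = draw(p_b_given_s[s])    # noisy observation at encoder b,
                                      # drawn independently of s_a given s
        triples.append((s, s_a, s_b))
    return triples
```

Drawing the two observations separately, each conditioned only on the current state, is what encodes the conditional independence of $S_t^a$ and $S_t^b$ given $S_t$.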

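Definition 2.1: An $(n, 2^{nR_a}, 2^{nR_b})$ code with block length $n$ and rate pair $(R_a, R_b)$ for an FS-MAC
with causal noisy state feedback consists of

(1) A sequence of mappings for each encoder

$$\phi_t^{(a)} : \mathcal{S}_a^t \times \mathcal{W}_a \to \mathcal{X}_a, \quad t = 1, 2, \dots, n;$$
$$\phi_t^{(b)} : \mathcal{S}_b^t \times \mathcal{W}_b \to \mathcal{X}_b, \quad t = 1, 2, \dots, n.$$

(2) An associated decoding function

$$\psi : \mathcal{S}^n \times \mathcal{Y}^n \to \mathcal{W}_a \times \mathcal{W}_b.$$

The system's probability of error, $P_e^{(n)}$, is given by

$$P_e^{(n)} = \frac{1}{|\mathcal{W}_a||\mathcal{W}_b|} \sum_{w_a=1}^{|\mathcal{W}_a|} \sum_{w_b=1}^{|\mathcal{W}_b|} P\left(\psi(S_{[n]}, Y_{[n]}) \neq (w_a, w_b) \,\middle|\, W = (w_a, w_b)\right).$$

A rate pair $(R_a, R_b)$ is achievable if for any $\epsilon > 0$, there exists, for all $n$ sufficiently large, an $(n, 2^{nR_a}, 2^{nR_b})$
code such that $\frac{1}{n} \log |\mathcal{W}_a| \ge R_a \ge 0$, $\frac{1}{n} \log |\mathcal{W}_b| \ge R_b \ge 0$ and $P_e^{(n)} \le \epsilon$. The capacity region of the
FS-MAC, $\mathcal{C}_{FS}$, is the closure of the set of all achievable rate pairs $(R_a, R_b)$ and the sum-rate capacity
is defined as $C_{FS}^{\Sigma} := \max_{(R_a, R_b) \in \mathcal{C}_{FS}} (R_a + R_b)$.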

Before proceeding with the main result, we introduce memoryless stationary team policies [1] and
their associated rate regions. Let the set of all possible functions from $\mathcal{S}_a$ to $\mathcal{X}_a$ and $\mathcal{S}_b$ to $\mathcal{X}_b$ be denoted
by $\mathcal{T}_a := \mathcal{X}_a^{|\mathcal{S}_a|}$ and $\mathcal{T}_b := \mathcal{X}_b^{|\mathcal{S}_b|}$, respectively. We shall refer to $\mathcal{T}_a$-valued and $\mathcal{T}_b$-valued random vectors
as Shannon strategies.

Definition 2.2: [1] A memoryless stationary (in time) team policy is a family

$$\Pi = \left\{ \pi = (\pi_{T^a}(\cdot), \pi_{T^b}(\cdot)) \in \mathcal{P}(\mathcal{T}_a) \times \mathcal{P}(\mathcal{T}_b) \right\} \tag{4}$$

of probability distribution pairs on $(\mathcal{T}_a, \mathcal{T}_b)$.

For every memoryless stationary team policy $\pi$, let $\mathcal{R}_{FS}(\pi)$ denote the region of all rate pairs $R =
(R_a, R_b)$ satisfying

$$R_a < I(T^a; Y | T^b, S) \tag{5}$$

$$R_b < I(T^b; Y | T^a, S) \tag{6}$$

$$R_a + R_b < I(T^a, T^b; Y | S) \tag{7}$$

where $S$, $T^a$, $T^b$ and $Y$ are random variables taking values in $\mathcal{S}$, $\mathcal{T}_a$, $\mathcal{T}_b$ and $\mathcal{Y}$, respectively, and whose
joint probability distribution factorizes as

$$P_{S, T^a, T^b, Y}(s, t^a, t^b, y) = P_S(s)\, P_{Y|T^a, T^b, S}(y|t^a, t^b, s)\, \pi_{T^a}(t^a)\, \pi_{T^b}(t^b). \tag{8}$$

Let $\mathcal{C}_{IN} := \overline{\mathrm{co}}\left( \bigcup_{\pi} \mathcal{R}_{FS}(\pi) \right)$ denote the closure of the convex hull of the rate regions $\mathcal{R}_{FS}(\pi)$ given by
(5)-(7) associated to all possible memoryless stationary team policies as defined in (4). We now present
an inner bound and an outer bound to the capacity region. The latter bound is obtained by providing a
tight converse to the sum-rate capacity.
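
To make the region (5)-(7) concrete, the sketch below evaluates the three mutual information terms for one product policy on a small finite example. It is an illustration under stated assumptions rather than a procedure from the paper: the function names, the list/dict data layout, and the explicit formula used for $P_{Y|T^a,T^b,S}$ (obtained by averaging the channel law over the noisy observations, with $x^i = t^i(s^i)$ and the factorization (1)) are all choices made for this example.

```python
import math


def entropy(pmf):
    """Shannon entropy in bits of a probability vector."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)


def rate_bounds(p_s, p_a_given_s, p_b_given_s, channel, pi_a, pi_b, n_y):
    """Return (I(Ta;Y|Tb,S), I(Tb;Y|Ta,S), I(Ta,Tb;Y|S)) in bits.

    p_s[s]                  : state distribution
    p_a_given_s[s][sa]      : noisy observation law at encoder a
    p_b_given_s[s][sb]      : noisy observation law at encoder b
    channel[s][xa][xb][y]   : channel transition law
    pi_a, pi_b              : dicts mapping a strategy tuple t (t[obs] = input)
                              to its probability (a product policy as in (8))
    """
    def p_y_given(ta, tb, s):
        # Average the channel over the encoders' noisy observations.
        out = [0.0] * n_y
        for sa, pa in enumerate(p_a_given_s[s]):
            for sb, pb in enumerate(p_b_given_s[s]):
                for y in range(n_y):
                    out[y] += pa * pb * channel[s][ta[sa]][tb[sb]][y]
        return out

    h_all = h_tb = h_ta = h_s = 0.0
    for s, ps in enumerate(p_s):
        cond = {(ta, tb): p_y_given(ta, tb, s) for ta in pi_a for tb in pi_b}
        # H(Y | Ta, Tb, S = s)
        h_all += ps * sum(pi_a[ta] * pi_b[tb] * entropy(cond[ta, tb])
                          for ta in pi_a for tb in pi_b)
        # H(Y | Tb, S = s): mix over Ta
        for tb in pi_b:
            mix = [sum(pi_a[ta] * cond[ta, tb][y] for ta in pi_a)
                   for y in range(n_y)]
            h_tb += ps * pi_b[tb] * entropy(mix)
        # H(Y | Ta, S = s): mix over Tb
        for ta in pi_a:
            mix = [sum(pi_b[tb] * cond[ta, tb][y] for tb in pi_b)
                   for y in range(n_y)]
            h_ta += ps * pi_a[ta] * entropy(mix)
        # H(Y | S = s): mix over both strategies
        mix = [sum(pi_a[ta] * pi_b[tb] * cond[ta, tb][y]
                   for ta in pi_a for tb in pi_b) for y in range(n_y)]
        h_s += ps * entropy(mix)
    return h_tb - h_all, h_ta - h_all, h_s - h_all
```

For instance, with a binary state, perfect observations, the deterministic channel $y = x_a \oplus x_b \oplus s$, encoder $a$ choosing uniformly between the two constant strategies and encoder $b$ always sending 0, the three values are 1, 0 and 1 bit, respectively.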

Theorem 2.1 (Inner Bound to $\mathcal{C}_{FS}$): $\mathcal{C}_{IN} \subseteq \mathcal{C}_{FS}$.

The achievability proof follows the standard arguments of joint $\epsilon$-typical $n$-sequences [27, Section 15.2].

Definition 2.3: [27] Fix integer $k \ge 1$. The set $A_\epsilon^n$ of $\epsilon$-typical $n$-sequences $\{(x^1_{[n]}, \dots, x^k_{[n]})\}$ with
respect to the distribution $P_{X^1, \dots, X^k}(x^1, \dots, x^k) = \prod_{i=1} P_{X^i}(x^i)$ is defined by

$$A_\epsilon^n = \left\{ (x^1_{[n]}, \dots, x^k_{[n]}) \in \mathcal{X}_1^n \times \cdots \times \mathcal{X}_k^n : \left| -\frac{1}{n} \log P(u) - H(U) \right| < \epsilon, \ \forall U \subseteq \{X^1, \dots, X^k\} \right\},$$

where $u$ denotes an ordered sequence in $x^1_{[n]}, \dots, x^k_{[n]}$ corresponding to $U$.
Proof of Theorem 2.1: Fix $(R_a, R_b) \in \mathcal{R}_{FS}(\pi)$.

Codebook Generation: Fix $\pi_{T^a}(t^a)$ and $\pi_{T^b}(t^b)$. For each $w_a \in \{1, \dots, 2^{nR_a}\}$, randomly generate its
corresponding $n$-tuple $t^a_{[n],w_a}$, each according to $\prod_{i=1}^{n} \pi_{T^a}(t^a_{i,w_a})$. Similarly, for each $w_b \in \{1, \dots, 2^{nR_b}\}$,
randomly generate its corresponding $n$-tuple $t^b_{[n],w_b}$, each according to $\prod_{i=1}^{n} \pi_{T^b}(t^b_{i,w_b})$. The set of these
codeword pairs forms the codebook, which is revealed to the decoder, while the codewords $t^l_{[n],w_l}$ are revealed
to encoder $l$, $l \in \{a, b\}$.

Encoding: Define the encoding functions as follows: $x_i^a(w_a) = \phi_i^{a}(w_a, s^a_{[i]}) = t^a_{i,w_a}(s_i^a)$ and $x_i^b(w_b) =
\phi_i^{b}(w_b, s^b_{[i]}) = t^b_{i,w_b}(s_i^b)$, where $t^a_{i,w_a}$ and $t^b_{i,w_b}$ denote the $i$-th component of $t^a_{[n],w_a}$ and $t^b_{[n],w_b}$, respectively,
and $s_i^a$ and $s_i^b$ denote the last components of $s^a_{[i]}$ and $s^b_{[i]}$, respectively, $i = 1, \dots, n$. Therefore, to send
the messages $w_a$ and $w_b$, we simply transmit the corresponding $t^a_{[n],w_a}$ and $t^b_{[n],w_b}$, respectively.

Decoding: After receiving $(y_{[n]}, s_{[n]})$, the decoder looks for the only pair $(\hat{w}_a, \hat{w}_b)$ such that
$(t^a_{[n],\hat{w}_a}, t^b_{[n],\hat{w}_b}, y_{[n]}, s_{[n]})$ are jointly $\epsilon$-typical and declares this pair as its estimate $(\hat{w}_a, \hat{w}_b)$.

Error Analysis: Without loss of generality, we can assume that $(w_a, w_b) = (1, 1)$ was sent. An error
occurs if the correct codewords are not typical with the received sequence or there is a pair of incorrect
codewords that are typical with the received sequence. Define the events $E_{\alpha,\beta} := \{(T^a_{[n],\alpha}, T^b_{[n],\beta}, Y_{[n]}, S_{[n]}) \in
A_\epsilon^n\}$, $\alpha \in \{1, \dots, 2^{nR_a}\}$ and $\beta \in \{1, \dots, 2^{nR_b}\}$. Then, by the union bound we get

$$
\begin{aligned}
P_e^{(n)} &= P\Big( E_{1,1}^c \cup \bigcup_{(\alpha,\beta) \neq (1,1)} E_{\alpha,\beta} \Big) \\
&\le P(E_{1,1}^c) + \sum_{\alpha = 1, \beta \neq 1} P(E_{\alpha,\beta}) + \sum_{\alpha \neq 1, \beta = 1} P(E_{\alpha,\beta}) + \sum_{\alpha \neq 1, \beta \neq 1} P(E_{\alpha,\beta})
\end{aligned} \tag{9}
$$

where $E_{1,1}^c$ denotes the complement set of $E_{1,1}$. It can easily be verified that $\{Y_i, S_i, T_i^a, T_i^b\}_{i=1}^{n}$ is an
i.i.d. sequence and by [27, Theorem 15.2.1], $P(E_{1,1}^c) \to 0$ as $n \to \infty$. Next, let us consider the second
term

$$
\begin{aligned}
\sum_{\alpha = 1, \beta \neq 1} P(E_{\alpha,\beta}) &= \sum_{\alpha = 1, \beta \neq 1} P\left( (T^a_{[n],1}, T^b_{[n],\beta}, Y_{[n]}, S_{[n]}) \in A_\epsilon^n \right) \\
&\overset{(i)}{=} \sum_{\alpha = 1, \beta \neq 1} \ \sum_{(t^a_{[n]}, t^b_{[n]}, y_{[n]}, s_{[n]}) \in A_\epsilon^n} P_{T^b_{[n]}}(t^b_{[n]})\, P_{T^a_{[n]}, Y_{[n]}, S_{[n]}}(t^a_{[n]}, y_{[n]}, s_{[n]}) \\
&\le \sum_{\alpha = 1, \beta \neq 1} |A_\epsilon^n|\, 2^{-n[H(T^b) - \epsilon]}\, 2^{-n[H(T^a, Y, S) - \epsilon]}
\end{aligned} \tag{10}
$$

$$
\begin{aligned}
&\le 2^{nR_b}\, 2^{-n[H(T^b) + H(T^a, Y, S) - H(T^a, T^b, Y, S) - 3\epsilon]} \\
&\overset{(ii)}{=} 2^{n[R_b - I(T^b; Y | S, T^a) + 3\epsilon]}
\end{aligned} \tag{11}
$$

where (i) holds since for $\beta \neq 1$, $T^b_{[n],\beta}$ is independent of $(T^a_{[n],1}, Y_{[n]}, S_{[n]})$ and (ii) follows since $T^b$ and
$(T^a, S)$ are independent and $I(T^b; Y, T^a, S) = I(T^b; T^a, S) + I(T^b; Y | T^a, S) = I(T^b; Y | T^a, S)$, where
$I(T^b; T^a, S) = 0$. Following the same steps for $(\alpha \neq 1, \beta = 1)$ and $(\alpha \neq 1, \beta \neq 1)$ we get

$$\sum_{\alpha \neq 1, \beta = 1} P(E_{\alpha,\beta}) \le 2^{n[R_a - I(T^a; Y | T^b, S) + 3\epsilon]}, \qquad \sum_{\alpha \neq 1, \beta \neq 1} P(E_{\alpha,\beta}) \le 2^{n[R_a + R_b - I(T^a, T^b; Y | S) + 3\epsilon]}, \tag{12}$$

and, since $\epsilon$ can be chosen arbitrarily small, the rate conditions of $\mathcal{R}_{FS}(\pi)$ imply that each term in (9) tends to zero as $n \to \infty$. This
shows the achievability of a rate pair $(R_a, R_b) \in \mathcal{R}_{FS}(\pi)$. Achievability of any rate pair in $\mathcal{C}_{IN}$ follows
from a standard time-sharing argument. ■
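
The codebook generation and encoding steps of this proof can be sketched in a few lines. This is only an illustrative sketch: the names `generate_codebook` and `encode`, the representation of a Shannon strategy as a tuple indexed by the noisy state observation, and the zero-based message index are assumptions introduced here, and no decoder is implemented.

```python
import random


def generate_codebook(num_messages, n, pi, seed=0):
    """Random codebook: for each message, n Shannon strategies drawn i.i.d. from pi.

    pi maps a strategy tuple t (with t[obs] = channel input) to its probability.
    """
    rng = random.Random(seed)
    strategies = list(pi)
    weights = [pi[t] for t in strategies]
    return [rng.choices(strategies, weights=weights, k=n)
            for _ in range(num_messages)]


def encode(codebook, w, noisy_states):
    """Channel inputs for message w: x_i = t_{i,w}(s_i).

    Only the current noisy observation s_i is used at time i, mirroring the
    encoding rule in the proof; the past observations are ignored.
    """
    codeword = codebook[w]
    return [codeword[i][s_i] for i, s_i in enumerate(noisy_states)]
```

Each encoder would hold its own codebook generated this way (from $\pi_{T^a}$ or $\pi_{T^b}$) and apply `encode` to its own observation sequence.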
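Let

$$\mathcal{C}_{OUT} := \left\{ (R_a, R_b) \in \mathbb{R}^+ \times \mathbb{R}^+ : R_a + R_b \le \sup_{\pi_{T^a}(t^a)\pi_{T^b}(t^b)} I(T^a, T^b; Y | S) \right\},$$

where $\mathbb{R}^+$ is the set of positive reals.

Theorem 2.2 (Outer Bound to $\mathcal{C}_{FS}$): $\mathcal{C}_{FS} \subseteq \mathcal{C}_{OUT}$.

As a consequence of Theorems 2.1 and 2.2, we have the following corollary which can be thought of as
an extension of [12, Theorem 4] to the case where the encoders have correlated CSI.

Corollary 2.1:

$$C_{FS}^{\Sigma} = \sup_{\pi_{T^a}(t^a)\pi_{T^b}(t^b)} I(T^a, T^b; Y | S). \tag{13}$$

Proof of Theorem 2.2: We need to show that all achievable rates satisfy

$$R_a + R_b \le \sup_{\pi_{T^a}(t^a)\pi_{T^b}(t^b)} I(T^a, T^b; Y | S),$$

i.e., a converse for the sum-rate capacity. Following [1], let

$$\alpha_\mu := \frac{1}{n} P_{S_{[t-1]}}(\mu) \quad \text{and} \quad \eta(\epsilon) := \frac{\epsilon}{1 - \epsilon} \log |\mathcal{Y}| + \frac{H(\epsilon)}{1 - \epsilon}. \tag{14}$$

Observe that $\lim_{\epsilon \to 0} \eta(\epsilon) = 0$ and

$$\sum_{\mu \in \mathcal{S}^{(n)}} \alpha_\mu = \sum_{1 \le t \le n} \ \sum_{\mu \in \mathcal{S}^{t-1}} \frac{1}{n} P_{S_{[t-1]}}(\mu) = 1,$$

where $\mathcal{S}^{(n)}$ and $\mathcal{S}^{t-1}$ are the sets of all $\mathcal{S}$-strings of length smaller than $n$ and of length $(t - 1)$, respectively.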

First recall that, $\forall t \ge 1$, $X_t^a = \phi_t^{(a)}(W_a, S^a_{[t]}) = \phi_t^{(a)}(W_a, S^a_{[t-1]}, S_t^a)$ and $X_t^b = \phi_t^{(b)}(W_b, S^b_{[t]}) =
\phi_t^{(b)}(W_b, S^b_{[t-1]}, S_t^b)$. Then, we can define the Shannon strategies $T_t^a \in \mathcal{T}_a$ and $T_t^b \in \mathcal{T}_b$ by putting, for
every $s_a \in \mathcal{S}_a$ and $s_b \in \mathcal{S}_b$,

$$T_t^a(s_a) := \phi_t^{(a)}(W_a, S^a_{[t-1]}, s_a), \qquad T_t^b(s_b) := \phi_t^{(b)}(W_b, S^b_{[t-1]}, s_b). \tag{15}$$
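
The construction in (15) amounts to fixing the message and the past observations and tabulating the encoder output over all possible current observations. A minimal sketch follows; the helper name `induced_strategy` and the toy encoder `toy_phi` are hypothetical and serve only to illustrate the idea.

```python
def induced_strategy(phi_t, w, past_obs, state_alphabet):
    """Shannon strategy induced by the encoder map phi_t, as in (15).

    Returns a tuple whose entry for observation s is phi_t(w, past_obs, s),
    i.e. a function from the current noisy state to the channel input.
    """
    return tuple(phi_t(w, past_obs, s) for s in state_alphabet)


def toy_phi(w, past_obs, s):
    """Hypothetical binary encoder used only for illustration."""
    return (w + sum(past_obs) + s) % 2
```

For example, `induced_strategy(toy_phi, 1, (0, 1), (0, 1))` tabulates the encoder for message 1 and past observations (0, 1), giving the strategy that maps observation 0 to input 0 and observation 1 to input 1.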

We now show that the sum of any achievable rate pair can be written as the convex combinations 
of conditional mutual information terms which are indexed by the realization of past complete state 
information. 

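Lemma 2.1: Let $T_t^a \in \mathcal{T}_a$ and $T_t^b \in \mathcal{T}_b$ be the Shannon strategies induced by $\phi_t^{(a)}$ and $\phi_t^{(b)}$, respectively,
as shown in (15). Assume that a rate pair $R = (R_a, R_b)$, with block length $n \ge 1$ and a constant
$\epsilon \in (0, 1/2)$, is achievable. Then,

$$R_a + R_b \le \sum_{\mu \in \mathcal{S}^{(n)}} \alpha_\mu\, I(T_t^a, T_t^b; Y_t | S_t, S_{[t-1]} = \mu) + \eta(\epsilon). \tag{16}$$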
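Proof: Let $T_t := (T_t^a, T_t^b)$. By Fano's inequality, we get

$$H(W | Y_{[n]}, S_{[n]}) \le H(\epsilon) + \epsilon \log(|\mathcal{W}_a||\mathcal{W}_b|). \tag{17}$$

Observing that

$$
\begin{aligned}
I(W; Y_{[n]}, S_{[n]}) &= H(W) - H(W | Y_{[n]}, S_{[n]}) \\
&= \log(|\mathcal{W}_a||\mathcal{W}_b|) - H(W | Y_{[n]}, S_{[n]}).
\end{aligned} \tag{18}
$$

Combining (17) and (18) gives

$$(1 - \epsilon) \log(|\mathcal{W}_a||\mathcal{W}_b|) \le I(W; Y_{[n]}, S_{[n]}) + H(\epsilon)$$

and

$$R_a + R_b \le \frac{1}{n} \log(|\mathcal{W}_a||\mathcal{W}_b|) \le \frac{1}{1 - \epsilon} \frac{1}{n} \left( I(W; Y_{[n]}, S_{[n]}) + H(\epsilon) \right). \tag{19}$$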

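Furthermore,

$$
\begin{aligned}
I(W; Y_{[n]}, S_{[n]}) &= \sum_{t=1}^{n} \left[ H(Y_t, S_t | S_{[t-1]}, Y_{[t-1]}) - H(Y_t, S_t | W, S_{[t-1]}, Y_{[t-1]}) \right] \\
&\overset{(i)}{=} \sum_{t=1}^{n} \left[ H(Y_t | S_{[t]}, Y_{[t-1]}) - H(Y_t | W, S_{[t]}, Y_{[t-1]}) \right] \\
&\overset{(ii)}{\le} \sum_{t=1}^{n} \left[ H(Y_t | S_{[t]}) - H(Y_t | W, S_{[t]}, Y_{[t-1]}, T_t) \right] \\
&\overset{(iii)}{=} \sum_{t=1}^{n} \left[ H(Y_t | S_{[t]}) - H(Y_t | S_{[t]}, T_t) \right] \\
&= \sum_{t=1}^{n} I(T_t; Y_t | S_{[t]})
\end{aligned} \tag{20}
$$

where (i) is implied by (2), in (ii) $T_t := (T_t^a, T_t^b)$ are Shannon strategies whose realizations are mappings
$t_t^i : \mathcal{S}_i \to \mathcal{X}_i$ for $i \in \{a, b\}$ and thus (ii) holds since conditioning reduces entropy. Finally, (iii) follows
since

$$
\begin{aligned}
&P_{Y_t | W, S_t, S_{[t-1]}, Y_{[t-1]}, T_t^a, T_t^b}(y_t | w, s_t, s_{[t-1]}, y_{[t-1]}, t_t^a, t_t^b) \\
&\qquad = P_{Y_t | S_t, T_t^a, T_t^b}(y_t | s_t, t_t^a, t_t^b)
\end{aligned} \tag{21}
$$

where the equality is verified by (3) and (2), with $x_t^i = t_t^i(s_t^i)$ for $i \in \{a, b\}$. At this point, it
is worth noting that by (21), one can remove $S_{[t-1]}$ from (20) in the conditioning. However, we will
soon observe why it is crucial to keep it when we prove the product form. Now, let $\chi(\epsilon) := \frac{H(\epsilon)}{n(1 - \epsilon)}$;
combining (19)-(20) gives

$$
\begin{aligned}
R_a + R_b &\le \frac{1}{n} \log(|\mathcal{W}_a||\mathcal{W}_b|) \\
&\le \frac{1}{1 - \epsilon} \left( \frac{1}{n} \sum_{t=1}^{n} I(T_t^a, T_t^b; Y_t | S_{[t]}) \right) + \chi(\epsilon) + (n - 1)\chi(\epsilon) \\
&\overset{(a)}{\le} \frac{1}{n} \sum_{t=1}^{n} I(T_t^a, T_t^b; Y_t | S_{[t]}) + \frac{\epsilon}{1 - \epsilon} \frac{1}{n} \sum_{t=1}^{n} \log |\mathcal{Y}| + \frac{H(\epsilon)}{1 - \epsilon} \\
&= \frac{1}{n} \sum_{t=1}^{n} I(T_t^a, T_t^b; Y_t | S_{[t]}) + \eta(\epsilon)
\end{aligned} \tag{22}
$$

where (a) is valid since $I(T_t^a, T_t^b; Y_t | S_{[t]}) \le \log |\mathcal{Y}|$. Furthermore,

$$I(T_t^a, T_t^b; Y_t | S_{[t]}) = n \sum_{\mu \in \mathcal{S}^{t-1}} \alpha_\mu\, I(T_t^a, T_t^b; Y_t | S_t, S_{[t-1]} = \mu), \tag{23}$$

and substituting the above into (22) yields (16). ■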
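Note that, for any $t \ge 1$, $I(T_t^a, T_t^b; Y_t | S_t, S_{[t-1]} = \mu)$ is a function of the joint conditional distribution of
channel state $S_t$, inputs $T_t^a$, $T_t^b$ and output $Y_t$ given the past realization $(S_{[t-1]} = \mu)$. Hence, to complete
the proof of the outer bound, we need to show that $P_{T_t^a, T_t^b, Y_t, S_t | S_{[t-1]}}(t^a, t^b, y, s | \mu)$ factorizes as in (8).
This is done in the lemma below. In particular, it is crucial to observe that the knowledge of the past
state at the decoder, $S_{[t-1]}$, is enough to provide a product form on $T^a$ and $T^b$. Let

$$\Upsilon^a_{\mu_a}(t^a) := \left\{ w_a : \phi_t^{(a)}(w_a, S^a_{[t-1]} = \mu_a) = t^a \right\}, \qquad \Upsilon^b_{\mu_b}(t^b) := \left\{ w_b : \phi_t^{(b)}(w_b, S^b_{[t-1]} = \mu_b) = t^b \right\} \tag{24}$$

and

$$
\begin{aligned}
\pi^{\mu_a}_{T^a}(t^a) &:= \sum_{w_a \in \Upsilon^a_{\mu_a}(t^a)} \frac{1}{|\mathcal{W}_a|}, &\qquad \pi^{\mu}_{T^a}(t^a) &:= \sum_{\mu_a} P_{S^a_{[t-1]} | S_{[t-1]}}(\mu_a | \mu)\, \pi^{\mu_a}_{T^a}(t^a), \\
\pi^{\mu_b}_{T^b}(t^b) &:= \sum_{w_b \in \Upsilon^b_{\mu_b}(t^b)} \frac{1}{|\mathcal{W}_b|}, &\qquad \pi^{\mu}_{T^b}(t^b) &:= \sum_{\mu_b} P_{S^b_{[t-1]} | S_{[t-1]}}(\mu_b | \mu)\, \pi^{\mu_b}_{T^b}(t^b),
\end{aligned} \tag{25}
$$

where $\mu_a$ and $\mu_b$ denote particular realizations of $S^a_{[t-1]}$ and $S^b_{[t-1]}$, respectively.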
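Lemma 2.2: For every $1 \le t \le n$ and $\mu \in \mathcal{S}^{t-1}$, the following holds

$$P_{T_t^a, T_t^b, Y_t, S_t | S_{[t-1]}}(t^a, t^b, y, s | \mu) = P_S(s)\, P_{Y | S, T^a, T^b}(y | s, t^a, t^b)\, \pi^{\mu}_{T^a}(t^a)\, \pi^{\mu}_{T^b}(t^b). \tag{26}$$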

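Proof: Let $\bar{S} := (S_t, S_t^a, S_t^b)$ and $\bar{s} := (s, s_t^a, s_t^b)$. Observe that

$$
\begin{aligned}
P_{T_t^a, T_t^b, Y_t, S_t | S_{[t-1]}}(t^a, t^b, y, s | \mu) &= \sum_{s_t^a \in \mathcal{S}_a} \sum_{s_t^b \in \mathcal{S}_b} P_{T_t^a, T_t^b, Y_t, \bar{S} | S_{[t-1]}}(t^a, t^b, y, \bar{s} | \mu) \\
&= \sum_{s_t^a \in \mathcal{S}_a} \sum_{s_t^b \in \mathcal{S}_b} P_{Y | \bar{S}, T^a, T^b}(y | \bar{s}, t^a, t^b)\, P_{\bar{S}, T_t^a, T_t^b | S_{[t-1]}}(\bar{s}, t^a, t^b | \mu)
\end{aligned} \tag{27}
$$

where the second equality is shown in (21). Let us now consider the term $P_{\bar{S}, T_t^a, T_t^b | S_{[t-1]}}(\bar{s}, t^a, t^b | \mu)$ above.
We have the following

$$
\begin{aligned}
&P_{\bar{S}, T_t^a, T_t^b | S_{[t-1]}}(\bar{s}, t^a, t^b | \mu) \\
&= \sum_{w_a \in \mathcal{W}_a} \sum_{w_b \in \mathcal{W}_b} \sum_{\mu_a} \sum_{\mu_b} P_{W, S^a_{[t-1]}, S^b_{[t-1]}, \bar{S}, T_t^a, T_t^b | S_{[t-1]}}(w, \mu_a, \mu_b, \bar{s}, t^a, t^b | \mu) \\
&\overset{(i)}{=} P_{\bar{S}}(\bar{s}) \sum_{w_a \in \mathcal{W}_a} \sum_{w_b \in \mathcal{W}_b} \sum_{\mu_a} \sum_{\mu_b} P_{W, S^a_{[t-1]}, S^b_{[t-1]}, T_t^a, T_t^b | S_{[t-1]}}(w, \mu_a, \mu_b, t^a, t^b | \mu) \\
&\overset{(ii)}{=} P_{\bar{S}}(\bar{s}) \sum_{w_a \in \mathcal{W}_a} \sum_{w_b \in \mathcal{W}_b} \sum_{\mu_a} \sum_{\mu_b} 1_{\{t^i = \phi_t^{(i)}(w_i, \mu_i),\ i = a, b\}}\, P_{W, S^a_{[t-1]}, S^b_{[t-1]} | S_{[t-1]}}(w, \mu_a, \mu_b | \mu) \\
&\overset{(iii)}{=} P_{\bar{S}}(\bar{s}) \sum_{w_a \in \mathcal{W}_a} \sum_{w_b \in \mathcal{W}_b} \sum_{\mu_a} \sum_{\mu_b} 1_{\{t^i = \phi_t^{(i)}(w_i, \mu_i),\ i = a, b\}}\, \frac{1}{|\mathcal{W}_a|} \frac{1}{|\mathcal{W}_b|}\, P_{S^a_{[t-1]}, S^b_{[t-1]} | S_{[t-1]}}(\mu_a, \mu_b | \mu) \\
&\overset{(iv)}{=} P_{\bar{S}}(\bar{s}) \sum_{\mu_a} P_{S^a_{[t-1]} | S_{[t-1]}}(\mu_a | \mu) \sum_{w_a} \frac{1}{|\mathcal{W}_a|} 1_{\{t^a = \phi_t^{(a)}(w_a, \mu_a)\}} \sum_{\mu_b} P_{S^b_{[t-1]} | S_{[t-1]}}(\mu_b | \mu) \sum_{w_b} \frac{1}{|\mathcal{W}_b|} 1_{\{t^b = \phi_t^{(b)}(w_b, \mu_b)\}} \\
&\overset{(v)}{=} P_{\bar{S}}(\bar{s}) \sum_{\mu_a} P_{S^a_{[t-1]} | S_{[t-1]}}(\mu_a | \mu) \sum_{w_a \in \Upsilon^a_{\mu_a}(t^a)} \frac{1}{|\mathcal{W}_a|} \sum_{\mu_b} P_{S^b_{[t-1]} | S_{[t-1]}}(\mu_b | \mu) \sum_{w_b \in \Upsilon^b_{\mu_b}(t^b)} \frac{1}{|\mathcal{W}_b|} \\
&\overset{(vi)}{=} P_{\bar{S}}(\bar{s}) \sum_{\mu_a} P_{S^a_{[t-1]} | S_{[t-1]}}(\mu_a | \mu)\, \pi^{\mu_a}_{T^a}(t^a) \sum_{\mu_b} P_{S^b_{[t-1]} | S_{[t-1]}}(\mu_b | \mu)\, \pi^{\mu_b}_{T^b}(t^b) \\
&\overset{(vii)}{=} P_{\bar{S}}(\bar{s})\, \pi^{\mu}_{T^a}(t^a)\, \pi^{\mu}_{T^b}(t^b)
\end{aligned} \tag{28}
$$

where (i) is due to (2) and (15), (ii) is valid by (15), (iii) is due to (2), (iv) is valid by (1) and (15), (v)
is valid due to (24) and (vi)-(vii) are valid due to (25). Substituting (28) into (27) proves the lemma. ■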

We can now complete the proof of Theorem 2.2. With Lemma 2.1 it is shown that the sum of any
achievable rate pair can be approximated by the convex combinations of rate conditions given in (7)
which are indexed by $\mu \in \mathcal{S}^{(n)}$ and satisfy (8) for joint state-input-output distributions. More explicitly,
we have

$$
\begin{aligned}
R_a + R_b &\le \sum_{\mu \in \mathcal{S}^{(n)}} \alpha_\mu\, I(T_t^a, T_t^b; Y_t | S_t, S_{[t-1]} = \mu) + \eta(\epsilon) \\
&= \sum_{\mu \in \mathcal{S}^{(n)}} \alpha_\mu\, I(T_t^a, T_t^b; Y_t | S_t)_{\pi^{\mu}_{T^a}(t^a)\pi^{\mu}_{T^b}(t^b)} + \eta(\epsilon) \\
&\le \sup_{(\pi^{\mu}_{T^a}(t^a)\pi^{\mu}_{T^b}(t^b),\ \mu)} I(T_t^a, T_t^b; Y_t | S_t) + \eta(\epsilon) \\
&\le \sup_{(\pi_{T^a}(t^a)\pi_{T^b}(t^b) \in \Pi)} I(T_t^a, T_t^b; Y_t | S_t) + \eta(\epsilon),
\end{aligned}
$$

where $I(T_t^a, T_t^b; Y_t | S_t)_{\pi^{\mu}_{T^a}(t^a)\pi^{\mu}_{T^b}(t^b)}$ denotes the mutual information induced by the product distribution
$\pi^{\mu}_{T^a}(t^a)\pi^{\mu}_{T^b}(t^b)$ and the second step is valid since $I(T_t^a, T_t^b; Y_t | S_t, S_{[t-1]} = \mu)$ is a function of the joint
conditional distribution of channel state $S_t$, inputs $T_t^a$, $T_t^b$ and output $Y_t$ given the past realization $(S_{[t-1]} =
\mu)$. Hence, since $\lim_{\epsilon \to 0} \eta(\epsilon) = 0$, any achievable pair satisfies $R_a + R_b \le \sup_{\pi_{T^a}(t^a)\pi_{T^b}(t^b)} I(T^a, T^b; Y | S)$. ■

Having the achievability and converse proofs in hand, we can now prove Corollary 2.1.

Proof of Corollary 2.1: We need to show that there exists $(R_a, R_b) \in \mathcal{C}_{IN}$ achieving (13). We follow steps
akin to [27, p.535] where discrete memoryless MACs are considered. Let us fix $\pi_{T^a}(t^a)\pi_{T^b}(t^b)$ and
consider the rate constraints given in $\mathcal{C}_{IN}$

$$I(T^a; Y | T^b, S) = H(T^a | T^b, S) - H(T^a | T^b, Y, S) = H(T^a) - H(T^a | T^b, Y, S) \tag{29}$$
$$I(T^b; Y | T^a, S) = H(T^b | T^a, S) - H(T^b | T^a, Y, S) = H(T^b) - H(T^b | T^a, Y, S) \tag{30}$$

and

$$
\begin{aligned}
I(T^a, T^b; Y | S) &= H(T^a, T^b) - H(T^a, T^b | Y, S) \\
&= H(T^a) + H(T^b) - H(T^a | T^b, Y, S) - H(T^b | Y, S),
\end{aligned} \tag{31}
$$

where (29), (30) and (31) are valid since $T^a$ and $T^b$ are independent of each other and independent of
$S$. Observe now that for any $\pi_{T^a}(t^a)\pi_{T^b}(t^b)$, $I(T^a; Y | T^b, S) + I(T^b; Y | T^a, S) \ge I(T^a, T^b; Y | S)$ since
$H(T^b | Y, S) \ge H(T^b | T^a, Y, S)$. Therefore, the sum-rate constraint in $\mathcal{C}_{IN}$ is always active and hence,
there exists $(R_a, R_b) \in \mathcal{C}_{IN}$ achieving (13). ■
We now present a number of remarks. 

Remark 2.1: One essential step in the proof of Theorem 2.2 is that, once we have the complete CSI,
conditioning on which allows a product form on $T^a$ and $T^b$, there is no loss of optimality (for the
sum-rate capacity) in using associated memoryless team policies instead of using all the past information at
the receiver.

Remark 2.2: For the validity of Corollary 2.1, it is crucial to have the product form on $(T^a, T^b)$. If
this is not the case, we would get that $I(T^a; Y | T^b, S) + I(T^b; Y | T^a, S) = H(T^a | T^b) + H(T^b | T^a) -
H(T^a | T^b, Y, S) - H(T^b | T^a, Y, S)$ and $I(T; Y | S) = H(T^a | T^b) + H(T^b) - H(T^a | T^b, Y, S) - H(T^b | Y, S)$.
Therefore, it is possible to get an obsolete sum-rate constraint in $\mathcal{C}_{IN}$ and hence, achievability of $C_{FS}^{\Sigma}$ is
not guaranteed.

Remark 2.3: The main difference between the problem that we consider here and the one considered in [1] is the encoders' information at the decoder. More explicitly, in [1], the information at the encoders is available at the decoder. From this perspective, the main contribution of the result of this section can be thought of as showing that when the decoder has no knowledge of the encoders' CSI, by enlarging the input space, there is no loss of optimality (for the sum-rate capacity) if the optimization is performed by ignoring the past CSI at the encoders, given that the decoder has complete CSI.

A. Asymmetric Causal Noisy CSIT and Noisy CSIR 

In many practical applications, CSI first needs to be estimated by the receiver, for instance using training methods, and then the receiver feeds back this information to the transmitters. This motivates us to consider a scenario where the decoder is first provided with noisy CSI (where the noise models the estimation error) and then feeds back this noisy CSI to the encoders through independent but noisy feedback links as shown in Fig. 2, where $N_a$ and $N_b$ denote independent noise processes. The two encoders have causal noisy versions of the state information $S_t$ at each time $t \ge 1$, $S^a_t \in \mathcal{S}_a$, $S^b_t \in \mathcal{S}_b$, respectively, and the decoder has access to noisy CSI at time $t$, $S^r_t \in \mathcal{S}_r$. Based on the physical setup, the joint distribution of $(S_t, S^a_t, S^b_t, S^r_t)$ satisfies

$P_{S_t,S^a_t,S^b_t,S^r_t}(s_t, s^a_t, s^b_t, s^r_t) = P_{S^a_t|S^r_t}(s^a_t|s^r_t) P_{S^b_t|S^r_t}(s^b_t|s^r_t) P_{S^r_t|S_t}(s^r_t|s_t) P_{S_t}(s_t)$. (32)

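We also assume that the channel is memoryless (i.e., (3) holds) and that

$P_{S_{[n]},S^a_{[n]},S^b_{[n]},S^r_{[n]},W_a,W_b}(s_{[n]}, s^a_{[n]}, s^b_{[n]}, s^r_{[n]}, w_a, w_b) = \frac{1}{|\mathcal{W}_a||\mathcal{W}_b|} \prod_{t=1}^{n} P_{S_t,S^a_t,S^b_t,S^r_t}(s_t, s^a_t, s^b_t, s^r_t)$. (33)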

We first provide inner and outer bounds on the capacity region and an expression for the sum-rate capacity, 
akin to Theorems 2.1, 2.2 and Corollary 2.1, respectively, when the feedback links are noisy. In the next 
subsection, by assuming that the CSITs are asymmetric deterministic functions of CSIR we obtain the 
full capacity region. 

A code can be defined as in Definition 2.1, except $\psi : \mathcal{S}_r^n \times \mathcal{Y}^n \to \mathcal{W}_a \times \mathcal{W}_b$. $P_e^{(n)}$, achievable rates and the capacity region, $\mathcal{C}_{NS}$, are defined similarly. The sum-rate capacity is denoted by $C^{NS}_{\Sigma}$. We also keep Definition 2.2 and slightly change the associated rate region.

Fig. 2. The multiple-access channel with causal noisy CSIT and noisy CSIR.



For every memoryless stationary team policy $\pi$ defined in (2.2), let $\mathcal{R}_{NS}(\pi)$ denote the region of all rate pairs $R = (R_a, R_b)$ satisfying

$R_a \le I(T^a; Y|T^b, S^r)$ (34)
$R_b \le I(T^b; Y|T^a, S^r)$ (35)
$R_a + R_b \le I(T^a, T^b; Y|S^r)$ (36)

where $S^r$, $T^a$, $T^b$ and $Y$ are random variables taking values in $\mathcal{S}_r$, $\mathcal{T}_a$, $\mathcal{T}_b$ and $\mathcal{Y}$, respectively, and whose joint probability distribution factorizes as

$P_{S^r,T^a,T^b,Y}(s^r, t^a, t^b, y) = P_{S^r}(s^r) P_{Y|T^a,T^b,S^r}(y|t^a, t^b, s^r) \pi_{T^a}(t^a) \pi_{T^b}(t^b)$. (37)

Let $\mathcal{C}_{IN} := \overline{co}\left(\bigcup_{\pi} \mathcal{R}_{NS}(\pi)\right)$ denote the closure of the convex hull of the rate regions $\mathcal{R}_{NS}(\pi)$ given by (34)-(36) associated to all possible memoryless team policies as defined in (4).

Remark 2.4: It should be observed that once we have the Markov property (32), the setup with noisy CSIR described above is no more general than the setup with complete CSIR. This is because one can define an equivalent channel with conditional output probability

$P^{eq}_{Y|X^a,X^b,S^r}(y|x^a, x^b, s^r) = \sum_{s \in \mathcal{S}} P_{Y|X^a,X^b,S}(y|x^a, x^b, s) P_{S|S^r}(s|s^r)$ (38)

which follows from (33). With (38), the noisy CSIR problem reduces to the complete CSIR problem since we can now define a new channel with state $S^r$ and

$P_{S^a_t,S^b_t,S^r_t}(s^a_t, s^b_t, s^r_t) = P_{S^a_t|S^r_t}(s^a_t|s^r_t) P_{S^b_t|S^r_t}(s^b_t|s^r_t) P_{S^r_t}(s^r_t)$.

Hence, the proofs of the corollaries below follow directly from the complete CSIR case.
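
The averaging in (38) is straightforward to carry out numerically. The sketch below is illustrative only: the function name `equivalent_channel`, the dictionary layout and the toy numbers (a binary multiplier channel with a 0.9/0.1 posterior on the state) are assumptions made for the example, not part of the paper.

```python
def equivalent_channel(p_y_given_xs, p_s_given_sr):
    """Average the channel law over the posterior of S given S^r, as in (38).

    p_y_given_xs[(xa, xb, s)] is a list over y; p_s_given_sr[sr] is a list over s.
    Returns a dict mapping (xa, xb, sr) to a list over y.
    """
    inputs = sorted({(xa, xb) for (xa, xb, _s) in p_y_given_xs})
    eq = {}
    for sr, posterior in enumerate(p_s_given_sr):
        for xa, xb in inputs:
            n_y = len(p_y_given_xs[(xa, xb, 0)])
            eq[(xa, xb, sr)] = [
                sum(posterior[s] * p_y_given_xs[(xa, xb, s)][y]
                    for s in range(len(posterior)))
                for y in range(n_y)
            ]
    return eq


# Hypothetical toy channel: Y = (X^a X^b) xor S with binary alphabets.
p_y_given_xs = {
    (xa, xb, s): [1.0 if ((xa * xb) ^ s) == y else 0.0 for y in (0, 1)]
    for xa in (0, 1) for xb in (0, 1) for s in (0, 1)
}
# Hypothetical posterior P(s | s^r): the receiver's estimate is right 90% of the time.
p_s_given_sr = [[0.9, 0.1], [0.1, 0.9]]
p_eq = equivalent_channel(p_y_given_xs, p_s_given_sr)
```

Each entry of `p_eq` is a valid conditional distribution on the output, so the resulting object can be treated as a channel whose state is $S^r$ and is fully known at the decoder.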
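Corollary 2.2 (Inner Bound to $\mathcal{C}_{NS}$): $\mathcal{C}_{IN} \subseteq \mathcal{C}_{NS}$.
Corollary 2.3 (Outer Bound to $\mathcal{C}_{NS}$): $\mathcal{C}_{NS} \subseteq \mathcal{C}_{OUT}$, where

$\mathcal{C}_{OUT} := \left\{ (R_a, R_b) \in \mathbb{R}^+ \times \mathbb{R}^+ : R_a + R_b \le \sup_{\pi_{T^a}(t^a)\pi_{T^b}(t^b)} I(T^a, T^b; Y|S^r) \right\}$.

Corollary 2.4:

$C^{NS}_{\Sigma} = \sup_{\pi_{T^a}(t^a)\pi_{T^b}(t^b)} I(T^a, T^b; Y|S^r)$. (39)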

This corollary indicates the fact that even if we have noisy CSIR, Shannon strategies are still optimal for 
the sum-rate capacity as long as conditioning on the past CSIR gives a product form on these strategies. 

B. CSITs are Deterministic Functions of CSIR: Causal and Non-Causal Cases

Since the computation of optimal strategies in Corollaries 2.1 and 2.4 requires an optimization over extended input alphabets, it is worth considering the case in which the optimization can be performed over the input alphabets $\mathcal{X}_a$ and $\mathcal{X}_b$. The usual approach is to assume that the transmitters have access to partial (through a deterministic function such as a quantizer) state information at the decoder. In particular, let $S^i_t = f^i(S^r_t)$, where $f^i : \mathcal{S}_r \to \mathcal{S}_i$, $i \in \{a, b\}$.
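
To make the size of the extended input alphabet concrete, the following Python sketch enumerates all Shannon strategies, i.e., all maps from an encoder's CSI alphabet to its input alphabet. The function name `shannon_strategies` and the example alphabets are illustrative assumptions; the only fact used is that there are $|\mathcal{X}|^{|\mathcal{S}|}$ such maps.

```python
import itertools


def shannon_strategies(state_alphabet, input_alphabet):
    """All maps t: state_alphabet -> input_alphabet, each returned as a dict."""
    return [
        dict(zip(state_alphabet, images))
        for images in itertools.product(input_alphabet, repeat=len(state_alphabet))
    ]


# Hypothetical example: ternary CSI and binary inputs give 2**3 = 8 strategies.
strategies = shannon_strategies([0, 1, 2], [0, 1])
```

Even for small alphabets the strategy set grows exponentially in the size of the CSI alphabet, which is the practical motivation for seeking characterizations that optimize over the channel inputs directly.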

The equivalent channel defined in (38) shows that the causal setup of this problem is no more general 
than [1]. Hence, the main contribution of this subsection is to provide a single letter characterization for 
the capacity region for the non-causal case. The expression shows that the result of [1] also holds for 
non-causal coding. 

We keep the channel code definitions identical for the causal and non-causal cases, except that for the non-causal case we have $\phi^{(i)}_t : \mathcal{S}_i^n \times \mathcal{W}_i \to \mathcal{X}_i$, $i \in \{a, b\}$, $t = 1, \dots, n$.

Let $\mathcal{C}^{C}_{NS}$ and $\mathcal{C}^{NC}_{NS}$ denote the capacity region for the causal and non-causal cases, respectively. We need to modify Definition 2.2 in order to take the current CSI into account.

Definition 2.4: A memoryless stationary (in time) team policy is a family

$\Pi = \left\{ \pi = \left( \pi_{X^a|S^a}(\cdot|f^a(s^r)), \pi_{X^b|S^b}(\cdot|f^b(s^r)) \right) \in \mathcal{P}(\mathcal{X}_a) \times \mathcal{P}(\mathcal{X}_b) \right\}$. (40)

For every $\pi$ defined in (40), $\mathcal{R}^{D}_{NS}(\pi)$ denotes the region of all rate pairs $R = (R_a, R_b)$ satisfying

$R_a \le I(X^a; Y|X^b, S^r)$ (41)
$R_b \le I(X^b; Y|X^a, S^r)$ (42)
$R_a + R_b \le I(X^a, X^b; Y|S^r)$ (43)

where $S^r$, $X^a$, $X^b$ and $Y$ are random variables taking values in $\mathcal{S}_r$, $\mathcal{X}_a$, $\mathcal{X}_b$ and $\mathcal{Y}$, respectively, and whose joint probability distribution factorizes as

$P_{S^r,X^a,X^b,Y}(s^r, x^a, x^b, y) = P_{S^r}(s^r) P_{Y|X^a,X^b,S^r}(y|x^a, x^b, s^r) \pi_{X^a|S^a}(x^a|f^a(s^r)) \pi_{X^b|S^b}(x^b|f^b(s^r))$. (44)

Let $\overline{co}\left(\bigcup_{\pi} \mathcal{R}^{D}_{NS}(\pi)\right)$ denote the closure of the convex hull of the rate regions $\mathcal{R}^{D}_{NS}(\pi)$ given by (41)-(43) associated to all possible memoryless stationary team policies as defined in (40).
Theorem 2.3: $\mathcal{C}^{C}_{NS} = \mathcal{C}^{NC}_{NS} = \overline{co}\left(\bigcup_{\pi} \mathcal{R}^{D}_{NS}(\pi)\right)$.



For the achievability proof, see [1, Section III] and observe that any rate which is achievable with causal 
CSI is also achievable with non-causal CSI. For the converse proof of the non-causal case see Appendix 
A. The proof for the non-causal case is realized by observing that there is no loss of optimality if not 
only the past, as shown in [1], but also the future CSI is ignored given that the receiver is provided with 
complete CSI. 

It should be noted that the causal result can be thought of as an extension of [5, Proposition 1] to a multi-user case and that the non-causal case is also considered in [13, Theorem 3], where inner and outer bounds are provided.

Remark 2.5: Following [1, Remark 1], it is worth emphasizing that for the above argument to work, it is crucial that the past and future state realizations only affect the team policies and that the state information available at the decoder contains the one available at the two transmitters. In particular, the latter fact plays a role in the converse part of the coding theorem by enabling the decoder to ignore the past channel outputs, given that the channel is memoryless, without any loss of optimality.

Let us investigate this remark by considering the setup in Section II in order to observe that for the non-causal case the optimality of Shannon strategies is not guaranteed. Recall that we have

$I(W; Y_{[n]}, S_{[n]}) \le \sum_{t=1}^{n} \left[ H(Y_t|S_{[n]}, Y_{[t-1]}) - H(Y_t|W, S_{[n]}, Y_{[t-1]}, T_t) \right]$ (45)

where $T_t := (T^a_t, T^b_t)$. Consider now the right hand side of (45) and observe that

$P_{Y_t|W,S_{[n]},Y_{[t-1]},T^a_t,T^b_t}(y_t|w, s_{[n]}, y_{[t-1]}, t^a_t, t^b_t)$

$= \sum_{s^a_t, s^b_t} P_{Y_t|S_t,S^a_t,S^b_t,T^a_t,T^b_t}(y_t|s_t, s^a_t, s^b_t, t^a_t, t^b_t) P_{S^a_t,S^b_t|Y_{[t-1]},S_t}(s^a_t, s^b_t|y_{[t-1]}, s_t)$,

and therefore, the past channel outputs cannot be ignored. Recall that in the causal setup the conditional probability $P_{S^a_t,S^b_t|Y_{[t-1]},S_t}(s^a_t, s^b_t|y_{[t-1]}, s_t)$ is independent of past channel outputs.

III. Asymmetric Delayed, Asymmetric Noisy CSIT and Complete CSIR 

Consider the problem defined in Section II where the two encoders have access to asymmetrically delayed, with delays $d_a \ge 1$ and $d_b \ge 1$, respectively, and noisy versions of the state information $S_t$ at each time $t \ge 1$, modeled by $S^a_{t-d_a} \in \mathcal{S}_a$, $S^b_{t-d_b} \in \mathcal{S}_b$, respectively. The rest of the channel model is identical and hence, (1), (2) and (3) are valid throughout the section. We also assume that $S_t$ is fully available at the receiver. A code can be defined as in Definition 2.1, except now

$\phi^{(a)}_t : \mathcal{S}_a^{t-d_a} \times \mathcal{W}_a \to \mathcal{X}_a$, $t = 1, 2, \dots, n$;
$\phi^{(b)}_t : \mathcal{S}_b^{t-d_b} \times \mathcal{W}_b \to \mathcal{X}_b$, $t = 1, 2, \dots, n$. ^1

Let $\mathcal{C}_{DN}$ denote the capacity region of the delayed setup.

In the main result of this section the team policies are composed of probability distributions on the 
channel inputs rather than Shannon strategies. 

Definition 3.1: A memoryless stationary (in time) team policy is a family

$\Pi = \left\{ \pi = \left( \pi_{X^a}(\cdot), \pi_{X^b}(\cdot) \right) \in \mathcal{P}(\mathcal{X}_a) \times \mathcal{P}(\mathcal{X}_b) \right\}$. (46)

For every memoryless stationary team policy $\pi$, $\mathcal{R}_{DN}(\pi)$ denotes the region of all rate pairs $R = (R_a, R_b)$ satisfying

$R_a \le I(X^a; Y|X^b, S)$ (47)

$R_b \le I(X^b; Y|X^a, S)$ (48)

$R_a + R_b \le I(X^a, X^b; Y|S)$ (49)

where $S$, $X^a$, $X^b$ and $Y$ are random variables taking values in $\mathcal{S}$, $\mathcal{X}_a$, $\mathcal{X}_b$ and $\mathcal{Y}$, respectively, and whose joint probability distribution factorizes as

$P_{S,X^a,X^b,Y}(s, x^a, x^b, y) = P_S(s) P_{Y|X^a,X^b,S}(y|x^a, x^b, s) \pi_{X^a}(x^a) \pi_{X^b}(x^b)$. (50)

Let $\overline{co}\left(\bigcup_{\pi} \mathcal{R}_{DN}(\pi)\right)$ denote the closure of the convex hull of the rate regions $\mathcal{R}_{DN}(\pi)$ given by (47)-(49) associated to all possible memoryless stationary team policies as defined in (46).

^1 Obviously, when $d_i \ge t$, $i = a, b$, then $X^a_t = \phi^{(a)}_t(W_a)$ and $X^b_t = \phi^{(b)}_t(W_b)$.

Fig. 3. Cooperative multiple-access channel with noisy state feedback.

Theorem 3.1: $\mathcal{C}_{DN} = \overline{co}\left(\bigcup_{\pi} \mathcal{R}_{DN}(\pi)\right)$.
Achievability can be shown by random coding arguments. For the converse, see Appendix B.

Remark 3.1 (Strictly Causal Case): When $d_a = d_b = 1$, Theorem 3.1 is the capacity region of the setup with strictly causal CSITs. In [14] and [16], achievable rate regions are provided for the case when the channel is driven by two independent states (with no CSIT). When the encoders have strictly causal CSI (not noisy/not asymmetric), the authors proposed a region which is based on sending a compressed version of the state information available at the encoders to the decoder. Theorem 3.1 verifies that since the full CSI is available at the receiver and since the decoder does not need to access the current CSI at the encoders, there is no loss of optimality if the past information at the encoders is ignored.



IV. Cooperative FS-MAC with Noisy CSIT 

We now consider the last scenario of the paper. Assume a common message is provided to both encoders and one of the encoders has its own private message. Assume further that the encoder with the private message causally observes noisy state information, whereas the encoder with the common message only observes noisy state information with delay $d_a \ge 1$. Let the common and the private messages be $W_a$ and $W_b$, respectively, and let $S^a_{t-d_a}$, $d_a \ge 1$, and $S^b_t$ denote the CSI at encoders a and b, respectively, where $(S_t, S^a_t, S^b_t)$ satisfies (1) and (2). Hence, $X^a_t = \phi^{(a)}_t(W_a, S^a_{[t-d_a]})$ and $X^b_t = \phi^{(b)}_t(W_a, W_b, S^b_{[t]})$; see Fig. 3. Let $\mathcal{C}_C$ denote the capacity region for this channel. Recall that $\mathcal{T}_b = \mathcal{X}_b^{|\mathcal{S}_b|}$.

Definition 4.1: A memoryless stationary (in time) team policy is a family

$\Pi = \left\{ \pi = \left( \pi_{X^a,T^b}(\cdot, \cdot) \right) \in \mathcal{P}(\mathcal{X}_a) \times \mathcal{P}(\mathcal{T}_b) \right\}$ (51)

of probability distributions on $(\mathcal{X}_a, \mathcal{T}_b)$.

For every $\pi$, let $\mathcal{R}_C(\pi)$ denote the region of all rate pairs $R = (R_a, R_b)$ satisfying

$R_b \le I(T^b; Y|X^a, S)$ (52)
$R_a + R_b \le I(X^a, T^b; Y|S)$ (53)

where $S$, $X^a$, $T^b$ and $Y$ are random variables taking values in $\mathcal{S}$, $\mathcal{X}_a$, $\mathcal{T}_b$ and $\mathcal{Y}$, respectively, and whose joint probability distribution factorizes as

$P_{S,X^a,T^b,Y}(s, x^a, t^b, y) = P_S(s) P_{Y|X^a,T^b,S}(y|x^a, t^b, s) \pi_{X^a,T^b}(x^a, t^b)$. (54)

Let $\overline{co}\left(\bigcup_{\pi} \mathcal{R}_C(\pi)\right)$ denote the closure of the convex hull of the rate regions $\mathcal{R}_C(\pi)$ given by (52) and (53) associated to all possible memoryless stationary team policies as defined in (51).

Theorem 4.1: $\mathcal{C}_C = \overline{co}\left(\bigcup_{\pi} \mathcal{R}_C(\pi)\right)$.
See Appendix C for the proof.

Remark 4.1: Theorem 4.1 shows that when the common message encoder has no access to the current noisy CSI (since the delay $d_a \ge 1$), by enlarging the optimization space of the other encoder, via Shannon strategies, the past CSI can be ignored without loss of optimality if the decoder is provided with complete CSI.

One important observation to be made in the cooperative scenario is that we do not require a product form on the pair $(X^a, T^b)$ (see (54)). In connection with this observation, let us consider the following noisy CSIR setup.

Let the encoder with the private message causally observe noisy state information, whereas the encoder with the common message has no CSI, i.e., $X^a_t = \phi^{(a)}_t(W_a)$ and $X^b_t = \phi^{(b)}_t(W_a, W_b, S^b_{[t]})$, and the decoder also has access to noisy CSI at time $t$, $S^r_t \in \mathcal{S}_r$; see Fig. 4. Let $\mathcal{C}_G$ denote the capacity region for this setup. For every memoryless stationary team policy $\pi$ defined in (51), let $\mathcal{R}_G(\pi)$ denote the region of all rate pairs $R = (R_a, R_b)$ satisfying

$R_b \le I(T^b; Y|X^a, S^r)$ (55)
$R_a + R_b \le I(X^a, T^b; Y|S^r)$ (56)

where $S^r$, $X^a$, $T^b$ and $Y$ are random variables taking values in $\mathcal{S}_r$, $\mathcal{X}_a$, $\mathcal{T}_b$ and $\mathcal{Y}$, respectively, and whose joint probability distribution factorizes as

$P_{S^r,X^a,T^b,Y}(s^r, x^a, t^b, y) = P_{S^r}(s^r) P_{Y|X^a,T^b,S^r}(y|x^a, t^b, s^r) \pi_{X^a,T^b}(x^a, t^b)$. (57)

Fig. 4. Cooperative multiple-access channel with noisy CSIT and CSIR.

Let $\overline{co}\left(\bigcup_{\pi} \mathcal{R}_G(\pi)\right)$ denote the closure of the convex hull of the rate regions $\mathcal{R}_G(\pi)$ given by (55) and (56) associated to all possible $\pi$ as defined in (51).
Theorem 4.2: $\mathcal{C}_G = \overline{co}\left(\bigcup_{\pi} \mathcal{R}_G(\pi)\right)$.

Proof: The achievability proof is identical to that of Theorem 4.1. The converse proof is also similar and therefore, we only provide a sketch. In particular, observe the following chain for the converse proof of the condition on $R_b$:

$I(W_b; Y_{[n]}, S^r_{[n]}) \le I(W_b; Y_{[n]}, S^r_{[n]}|W_a)$

$= \sum_{t=1}^{n} \left[ H(Y_t|S^r_{[n]}, Y_{[t-1]}, W_a) - H(Y_t|S^r_{[n]}, Y_{[t-1]}, W_a, W_b) \right]$

$\overset{(i)}{=} \sum_{t=1}^{n} \left[ H(Y_t|S^r_{[t]}, Y_{[t-1]}, W_a) - H(Y_t|S^r_{[t]}, Y_{[t-1]}, W_a, W_b) \right]$

$= \sum_{t=1}^{n} \left[ H(Y_t|S^r_{[t]}, Y_{[t-1]}, W_a, X^a_t) - H(Y_t|S^r_{[t]}, Y_{[t-1]}, W_a, W_b, X^a_t, T^b_t) \right]$

$\overset{(ii)}{\le} \sum_{t=1}^{n} \left[ H(Y_t|S^r_{[t]}, X^a_t) - H(Y_t|S^r_{[t]}, Y_{[t-1]}, W_a, W_b, X^a_t, T^b_t) \right]$

$\overset{(iii)}{=} \sum_{t=1}^{n} \left[ H(Y_t|S^r_{[t]}, X^a_t) - H(Y_t|S^r_{[t]}, X^a_t, T^b_t) \right]$

$= \sum_{t=1}^{n} I(T^b_t; Y_t|X^a_t, S^r_{[t]})$ (58)

where (i) follows since the state is i.i.d., $T^b_t$ is the Shannon strategy induced by encoder b at time $t$ as shown in (113), (ii) is valid since conditioning reduces entropy, and (iii) is valid since the state is i.i.d. and can be shown along similar lines as (121). Hence, one can directly obtain that

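$R_b \le \sum_{\mu_r \in \mathcal{S}_r^{(n)}} \alpha_{\mu_r} I(T^b_t; Y_t|X^a_t, S^r_t, S^r_{[t-1]} = \mu_r) + \eta(\epsilon)$ (59)

$R_a + R_b \le \sum_{\mu_r \in \mathcal{S}_r^{(n)}} \alpha_{\mu_r} I(X^a_t, T^b_t; Y_t|S^r_t, S^r_{[t-1]} = \mu_r) + \eta(\epsilon)$ (60)

where $\alpha_{\mu_r} := \frac{1}{n} P_{S^r_{[t-1]}}(\mu_r)$ and $\eta(\epsilon)$ is given in (14). We now need to show that the joint distribution $P_{X^a_t,T^b_t,Y_t,S^r_t|S^r_{[t-1]}}(x^a, t^b, y, s^r|\mu_r)$ satisfies (57). Let $\pi^{\mu_r}_{X^a,T^b}(x^a, t^b) := P_{X^a_t,T^b_t|S^r_{[t-1]}}(x^a, t^b|\mu_r)$ and observe that

$P_{X^a_t,T^b_t,Y_t,S^r_t|S^r_{[t-1]}}(x^a, t^b, y, s^r|\mu_r)$

$= \sum_{s} \sum_{s^b} P_{Y_t|X^a_t,X^b_t,S_t}(y|x^a, t^b(s^b), s) P_{S_t,S^b_t,S^r_t}(s, s^b, s^r) P_{X^a_t,T^b_t|S^r_{[t-1]}}(x^a, t^b|\mu_r)$

$= \pi^{\mu_r}_{X^a,T^b}(x^a, t^b) P_{S^r}(s^r) P_{Y_t|X^a_t,T^b_t,S^r_t}(y|x^a, t^b, s^r)$ (61)

where the first equality is verified by (3) and by the fact that $(X^a_t, T^b_t)$ is independent of $(S_t, S^b_t, S^r_t)$. ■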

Remark 4.2: It should be observed that unlike Theorem 4.1 and the results in the previous sections, for the validity of Theorem 4.2, it is not required to have a Markov condition on $P_{S_t,S^b_t,S^r_t}(s_t, s^b_t, s^r_t)$ such as the one given in (32). Furthermore, the result also holds with no CSIR, i.e., $S^r = \emptyset$ is allowed, and in this case Theorem 4.2 is an extension of [18, Theorem 4] to a noisy setup.

Note that for the setup given in [18, Theorem 4], Theorem 4.2 provides an equivalent characterization. Recall that in [18, Theorem 4] the informed encoder has full CSI, i.e., $X^b_t = \phi^{(b)}_t(W_a, W_b, S_{[t]})$, both the uninformed encoder and the decoder have no CSI, and the capacity region, $\mathcal{C}_{AS}$, is given as the closure of all rate pairs $(R_a, R_b)$ satisfying

$R_b \le I(U; Y|X^a)$ (62)
$R_b + R_a \le I(U, X^a; Y)$ (63)

for some joint measure on $\mathcal{S} \times \mathcal{X}_a \times \mathcal{X}_b \times \mathcal{Y} \times \mathcal{U}$ having the form

$P_{Y|X^a,X^b,S}(y|x^a, x^b, s) P_{X^b|U,X^a,S}(x^b|u, x^a, s) P_S(s) P_{X^a,U}(x^a, u)$, (64)

where $|\mathcal{U}| \le |\mathcal{S}||\mathcal{X}_a||\mathcal{X}_b| + 1$. On the other hand, for this setup, Theorem 4.2 gives the capacity region, $\mathcal{C}_{FS}$, as $\overline{co}\left(\bigcup_{\pi} \mathcal{R}'_C(\pi)\right)$ where $\mathcal{R}'_C(\pi)$ denotes the region of all rate pairs $R = (R_a, R_b)$ satisfying

$R_b \le I(T; Y|X^a)$ (65)
$R_a + R_b \le I(T, X^a; Y)$ (66)

where $P_{Y,T,X^a,X^b,S}(y, t, x^a, x^b, s)$ factorizes as

$P_{Y|X^a,X^b,S}(y|x^a, x^b, s) P_{X^b|S,T}(x^b|s, t) P_S(s) \pi_{X^a,T}(x^a, t)$, (67)

and $T : \mathcal{S} \to \mathcal{X}_b$.

Although the relation between an auxiliary variable and Shannon strategies is well understood for the single-user case (e.g., see [22, Section 3.2]), we believe that it requires more attention in the multi-user case; in particular, note the difference between $|\mathcal{U}|$ and $|\mathcal{T}|$. Hence, we provide a proof for $\mathcal{C}_{FS} = \mathcal{C}_{AS}$; see Appendix D.

We conclude this section with the following remark. 

Remark 4.3: For the validity of the converse proof of Theorem 4.2 it is crucial that $X^a_t$ only depends on $W_a$. To be more explicit, let us assume $S^r = \emptyset$ and consider the following steps of the converse

$I(W_b; Y_{[n]}) \le \sum_{t=1}^{n} H(Y_t|Y_{[t-1]}, X^a_{[n]}) - H(Y_t|Y_{[t-1]}, W_a, W_b, X^a_{[n]}, T^b_t)$

$= \sum_{t=1}^{n} H(Y_t|Y_{[t-1]}, X^a_{[n]}) - H(Y_t|Y_{[t-1]}, X^a_{[n]}, T^b_t)$. (68)

Since $S_t$ is not available to the decoder, the above equality is valid if and only if $X^a_{[n]}$ does not provide any information about $S_t$. Hence, in other words, whether CSITs are noisy or not, if there is no CSI or noisy CSI at the decoder, the arguments above would fail if the uninformed encoder observes some degree of CSI, i.e., $d_a < \infty$, so that $X^a_t$ carries some information about $(S_t, S^a_t, S^b_t)$.

V. Examples 

We present two examples. In the first example we discuss the state dependent modulo-additive MAC 
with noisy CSIT and complete CSIR (as in Section II) and show that the proposed inner and outer bounds 
are tight and yield the capacity region. In the second example we consider the problem defined in Section 
II-B where the channel is a binary multiplier MAC with state being an interference sequence. 

A. Modulo- Additive FS-MAC with Noisy CSIT and Complete CSIR 

Recall that both the achievable regions and the sum-rate capacities of Sections II and II-A are given in terms of Shannon strategies. Hence, their computation requires an optimization over an extended space, from the input alphabet to a space of strategies, and is often hard; in fact, very few explicit solutions exist even in the single-user case. In [6], a symmetric, modulo-additive, single-user finite-state channel with complete CSIT is considered and a closed-form solution for the capacity is derived. Based on this result, we now consider the modulo-additive FS-MAC with asymmetric noisy CSIT and show that for the sum-rate capacity, the optimal set of strategies has a uniform distribution. This enables us to determine the entire capacity region by observing that under the uniform distribution both inner and outer bounds are tight.

To be more explicit, we consider a two-user FS-MAC in which the channel noise, defined by a process $\{Z_t\}_{t=1}^{\infty}$, is correlated with the state process. The channel is given by $Y = X^a \oplus X^b \oplus Z$ where $\mathcal{X}_a = \mathcal{X}_b = \mathcal{Y} = \mathcal{Z} = \{0, \dots, q-1\}$ and $Z_t$ is conditionally independent of $(X^a, X^b)$ given the state $S$; in the sequel, addition (and subtraction) is understood to be performed mod-$q$. Assume further that we have the setup of Section II. The following theorem is the main result of this example and can be thought of as an extension of [6, Theorem 1] to a noisy multi-user setting.

Theorem 5.1: The capacity region of the modulo-additive FS-MAC defined above is given by the closure of the rate pairs $(R_a, R_b)$ satisfying

$R_a \le \log q - H_{min}$

$R_b \le \log q - H_{min}$

$R_a + R_b \le \log q - H_{min}$ (69)

where $H_{min} := \min_{t^a, t^b} H(Z + t^a(S^a) + t^b(S^b)|S)$.

Proof: First, recall the rate condition given in Theorem 2.2:

$R_a + R_b \le H(Y|S) - H(Y|T^a, T^b, S)$. (70)

The sketch of the proof is to first determine the optimal distributions of $t^a$, $t^b$, i.e., the distributions achieving the sum-rate capacity, and then to conclude with the fact that these distributions yield the same inner bound. Let us first consider $H(Y|T^a, T^b, S)$. Clearly, $P_{Y|X^a,X^b,S}(y|x^a, x^b, s) = P_{Z|S}(y - x^a - x^b|s)$ and $H(Y|T^a, T^b, S) \ge \min_{t^a, t^b} H(Y|T^a = t^a, T^b = t^b, S)$. Observe that

$P_{Y|T^a,T^b,S}(y|t^a, t^b, s) = \sum_{s^a, s^b} P_{Y|T^a,T^b,S^a,S^b,S}(y|t^a, t^b, s^a, s^b, s) P_{S^a,S^b|S}(s^a, s^b|s)$

$= \sum_{s^a, s^b} P_{Z|S}(Z = y - t^a(s^a) - t^b(s^b)|s) P_{S^a,S^b|S}(s^a, s^b|s)$

$= P_{Z + t^a(S^a) + t^b(S^b)|S}(y|s)$, (71)

where the second step is valid since $Z$ is conditionally independent of $(S^a, S^b)$ given $S$. Therefore, $H(Y|T^a = t^a, T^b = t^b, S) = H(Z + t^a(S^a) + t^b(S^b)|S)$. Let $(t^{a*}, t^{b*})$ be two mappings from $\mathcal{S}_a$ to $\mathcal{X}_a$ and $\mathcal{S}_b$ to $\mathcal{X}_b$ for which $H(Y|T^a = t^{a*}, T^b = t^{b*}, S) = H_{min}$. Now, by Corollary 2.1, we have

$C^{FS}_{\Sigma} = \sup_{\pi_{T^a}(t^a)\pi_{T^b}(t^b)} \left[ H(Y|S) - H(Y|T^a, T^b, S) \right]$

$\le \sup_{\pi_{T^a}(t^a)\pi_{T^b}(t^b)} H(Y|S) - H_{min}$, (72)

and we now determine the policies $\{\pi_{T^a}(t^a), t^a \in \mathcal{T}_a\}$ and $\{\pi_{T^b}(t^b), t^b \in \mathcal{T}_b\}$ achieving the supremum above. Let us first define the following class of strategies

$\mathcal{T}_a^* := \{t^a_\tau\}$, where $t^a_\tau(s^a) = t^{a*}(s^a) + \tau$, $\tau = 1, \dots, q$ (73)
$\mathcal{T}_b^* := \{t^b_\tau\}$, where $t^b_\tau(s^b) = t^{b*}(s^b) - \tau$, $\tau = 1, \dots, q$. (74)

It should be noted that $H(Y|T^a = t^{a*}, T^b = t^{b*}, S) = H(Y|T^a = t^a_\tau, T^b = t^b_\tau, S)$ since $H(Y|T^a = t^a, T^b = t^b, S) = H(Z + t^a(S^a) + t^b(S^b)|S)$. Note that $H(Y|S) \le \log|\mathcal{Y}| = \log q$, but if we choose $T^a$ and $T^b$ uniformly distributed within $\mathcal{T}_a^*$ and $\mathcal{T}_b^*$, respectively (with zero mass on strategies not in $\mathcal{T}_a^*$ and $\mathcal{T}_b^*$), we would get

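$P_{Y|S}(y|s) \overset{(i)}{=} \sum_{s^a, s^b} \sum_{t^a \in \mathcal{T}_a^*} \sum_{t^b \in \mathcal{T}_b^*} P_{Y|T^a,T^b,S^a,S^b,S}(y|t^a, t^b, s^a, s^b, s) \frac{1}{q^2} P_{S^a,S^b|S}(s^a, s^b|s)$

$= \sum_{s^a, s^b} P_{S^a,S^b|S}(s^a, s^b|s) \frac{1}{q^2} \sum_{t^a \in \mathcal{T}_a^*} \sum_{t^b \in \mathcal{T}_b^*} P_{Z|S}(y - t^a(s^a) - t^b(s^b)|s)$

$\overset{(ii)}{=} \sum_{s^a, s^b} P_{S^a,S^b|S}(s^a, s^b|s) \frac{1}{q^2} \sum_{t^a \in \mathcal{T}_a^*} 1$

$\overset{(iii)}{=} \frac{1}{q}$ (75)

where (i) is valid since $T^a$ and $T^b$ are uniformly distributed, (ii) is due to (74) (i.e., follows from the fact that $t^b \in \mathcal{T}_b^*$ traces all possible values of $Z$) and finally, (iii) is valid since $|\mathcal{T}_a^*| = q$. Therefore, we get that $C^{FS}_{\Sigma} = \log q - H_{min}$, which is achieved by

$\pi_{T^a}(t^a) = \frac{1}{q}, \ \forall t^a \in \mathcal{T}_a^*, \quad \pi_{T^b}(t^b) = \frac{1}{q}, \ \forall t^b \in \mathcal{T}_b^*$. (76)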

Let us now consider the inner bound. In particular, we need to show that the sets of policies in (76) give $H(Y|T^a, S) = H(Y|T^b, S) = \log q$. Consider $H(Y|T^a, S)$ and observe that

$P_{Y|T^a,S}(y|t^a, s) \overset{(iv)}{=} \sum_{s^a, s^b} \sum_{t^b \in \mathcal{T}_b^*} P_{Y|T^a,T^b,S^a,S^b,S}(y|t^a, t^b, s^a, s^b, s) \frac{1}{q} P_{S^a,S^b|S}(s^a, s^b|s)$

$= \sum_{s^a, s^b} P_{S^a,S^b|S}(s^a, s^b|s) \frac{1}{q} \sum_{t^b \in \mathcal{T}_b^*} P_{Z|S}(y - t^a(s^a) - t^b(s^b)|s)$

$\overset{(v)}{=} \sum_{s^a, s^b} P_{S^a,S^b|S}(s^a, s^b|s) \frac{1}{q}$

$= \frac{1}{q}$ (77)

where (iv) is valid since $T^b$ is uniformly distributed and (v) is due to (74) (i.e., follows from the fact that $t^b \in \mathcal{T}_b^*$ traces all possible values of $Z$). Thus, $H(Y|T^a, S) = \log q$. It can be shown similarly that under (76) $H(Y|T^b, S) = \log q$. ■
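
The uniformity claims in (75) and (77) can be illustrated numerically. The following Python sketch is an illustration under assumed toy parameters: the value $q = 3$, the conditional distributions and the base strategies `ta_star`, `tb_star` are hypothetical choices, and the function name `output_pmf_given_state` is ours. It evaluates $P_{Y|S}(\cdot|s)$ when $T^a$ and $T^b$ are uniform over the $q$ shifted copies of the base strategies defined in (73) and (74).

```python
def output_pmf_given_state(q, p_z_given_s, p_sasb_given_s, ta_star, tb_star, s):
    """P(Y = y | S = s) for Y = X^a + X^b + Z (mod q), with T^a and T^b
    uniform over the q shifts of ta_star and tb_star, as in (73)-(74)."""
    pmf = [0.0] * q
    for (sa, sb), p_obs in p_sasb_given_s[s].items():
        for tau_a in range(q):
            for tau_b in range(q):
                xa = (ta_star[sa] + tau_a) % q  # shifted strategy of encoder a
                xb = (tb_star[sb] - tau_b) % q  # shifted strategy of encoder b
                for z, pz in enumerate(p_z_given_s[s]):
                    pmf[(xa + xb + z) % q] += p_obs * pz / (q * q)
    return pmf


# Hypothetical toy parameters (not taken from the paper).
q = 3
p_z_given_s = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]
p_sasb_given_s = [
    {(0, 0): 0.6, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.1},
    {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.6},
]
ta_star, tb_star = [0, 2], [1, 1]
pmf_s0 = output_pmf_given_state(q, p_z_given_s, p_sasb_given_s, ta_star, tb_star, 0)
pmf_s1 = output_pmf_given_state(q, p_z_given_s, p_sasb_given_s, ta_star, tb_star, 1)
```

In this toy instance the output is uniform on $\{0, \dots, q-1\}$ for each state value, in line with the conclusion $H(Y|S) = \log q$ drawn from (75); the uniformity does not depend on which base strategies are shifted.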
Finally, it is easy to see that when there is no side information at the encoders and at the decoder, the capacity region of the modulo-additive FS-MAC is given by the closure of rate pairs $(R_a, R_b)$ where

$R_a \le \log q - H(Z)$

$R_b \le \log q - H(Z)$

$R_a + R_b \le \log q - H(Z)$. (78)

Observe that we have

$H(Z + t^a(S^a) + t^b(S^b)|S) \le H(Z|S) + H(t^a(S^a) + t^b(S^b)|S)$

and hence

$H_{min} = \min_{t^a, t^b} H(Z + t^a(S^a) + t^b(S^b)|S) \le \min_{t^a, t^b} \left[ H(Z|S) + H(t^a(S^a) + t^b(S^b)|S) \right]$

$\overset{(vi)}{=} H(Z|S)$

$\overset{(vii)}{<} H(Z)$

where (vi) can be achieved with any deterministic mapping and (vii) is valid since Z and S (and hence 
S) are correlated. Therefore, availability of state information strictly increases, by an amount of at least 
I(S; Z), the capacity region of the modulo-additive FS-MAC. 
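
The chain $H_{min} \le H(Z|S) \le H(Z)$ can be evaluated by brute force for small alphabets. The Python sketch below is a minimal illustration with hypothetical parameters; in particular, it assumes that $S^a$ and $S^b$ are conditionally independent given $S$ and that $Z$ is conditionally independent of $(S^a, S^b)$ given $S$, and all function names and numerical values are ours rather than the paper's.

```python
import itertools
import math


def entropy(pmf):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)


def h_sum_given_s(q, p_s, p_z_s, p_sa_s, p_sb_s, ta, tb):
    """H(Z + ta(S^a) + tb(S^b) | S) with addition mod q."""
    total = 0.0
    for s, ps in enumerate(p_s):
        pmf = [0.0] * q
        for sa, pa in enumerate(p_sa_s[s]):
            for sb, pb in enumerate(p_sb_s[s]):
                for z, pz in enumerate(p_z_s[s]):
                    pmf[(z + ta[sa] + tb[sb]) % q] += pa * pb * pz
        total += ps * entropy(pmf)
    return total


def h_min(q, p_s, p_z_s, p_sa_s, p_sb_s):
    """Minimum of H(Z + ta(S^a) + tb(S^b) | S) over all pairs of mappings."""
    n_sa, n_sb = len(p_sa_s[0]), len(p_sb_s[0])
    return min(
        h_sum_given_s(q, p_s, p_z_s, p_sa_s, p_sb_s, ta, tb)
        for ta in itertools.product(range(q), repeat=n_sa)
        for tb in itertools.product(range(q), repeat=n_sb)
    )


# Hypothetical toy parameters (not taken from the paper).
q = 3
p_s = [0.5, 0.5]
p_z_s = [[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]]    # P(z | s)
p_sa_s = [[0.9, 0.1], [0.1, 0.9]]             # P(s^a | s)
p_sb_s = [[0.8, 0.2], [0.2, 0.8]]             # P(s^b | s)

hmin = h_min(q, p_s, p_z_s, p_sa_s, p_sb_s)
h_z_given_s = sum(ps * entropy(row) for ps, row in zip(p_s, p_z_s))
h_z = entropy([sum(ps * row[z] for ps, row in zip(p_s, p_z_s)) for z in range(q)])
```

The gap `h_z - h_z_given_s` equals $I(S; Z)$ for the chosen toy distribution, which is the lower bound on the capacity gain stated above.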

B. Binary Multiplier FS-MAC with Interference 

Consider the binary multiplier MAC with the state process interfering with the output, namely $Y = X^a X^b \oplus S$ where $\mathcal{X}_a = \mathcal{X}_b = \mathcal{Y} = \mathcal{S} = \{0, 1\}$. Assume further that the communication setup is given as in Section II-B with $S^r = S \oplus Z^r$ where $Z^r \sim Ber(p_r)$ is Bernoulli with $P(Z^r = 1) = p_r$. We now show that the capacity region, with both causal and non-causal coding, of this channel is given by the closure of $(R_a, R_b)$ where $R_a \le 1 - H(S|S^r)$, $R_b \le 1 - H(S|S^r)$ and $R_a + R_b \le 1 - H(S|S^r)$.

First recall the capacity region given in Theorem 2.3 and observe that $H(Y|S^r, X^a, X^b) = H(X^a X^b \oplus S|S^r, X^a, X^b) = H(S|S^r, X^a, X^b) = H(S|S^r)$, where the last equality follows from (32). Hence, input distributions do not affect $H(Y|S^r, X^a, X^b)$. Obviously, $H(Y|S^r) \le 1$, $H(Y|S^r, X^a) \le 1$ and $H(Y|S^r, X^b) \le 1$, and we now show that equalities can be achieved. More explicitly, we have the following optimizing distributions, which can be shown using basic inequalities

$\arg\max_{\pi_{X^a|S^a}(x^a|f^a(s^r)),\, \pi_{X^b|S^b}(x^b|f^b(s^r))} H(Y|S^r) = \{\pi_{X^a|S^a}(0|f^a(0)) = \pi_{X^a|S^a}(0|f^a(1)) = 0.5,$
$\pi_{X^b|S^b}(0|f^b(0)) = \pi_{X^b|S^b}(0|f^b(1)) = 0.5\}$ (79)

$\arg\max_{\pi_{X^a|S^a}(x^a|f^a(s^r)),\, \pi_{X^b|S^b}(x^b|f^b(s^r))} H(Y|S^r, X^a) = \{\pi_{X^a|S^a}(0|f^a(0)) = \pi_{X^a|S^a}(0|f^a(1)) = 0,$
$\pi_{X^b|S^b}(0|f^b(0)) = \pi_{X^b|S^b}(0|f^b(1)) = 0.5\}$ (80)

$\arg\max_{\pi_{X^a|S^a}(x^a|f^a(s^r)),\, \pi_{X^b|S^b}(x^b|f^b(s^r))} H(Y|S^r, X^b) = \{\pi_{X^b|S^b}(0|f^b(0)) = \pi_{X^b|S^b}(0|f^b(1)) = 0,$
$\pi_{X^a|S^a}(0|f^a(0)) = \pi_{X^a|S^a}(0|f^a(1)) = 0.5\}$ (81)

and in the rest, let us show that these yield the equalities in the conditional entropies. Let us start with $R_a$, i.e., $H(Y|S^r, X^b)$. Note that

$H(Y|S^r, X^b) = \sum_{s^r \in \{0,1\}} \sum_{x^b \in \{0,1\}} P_{S^r}(s^r) \pi_{X^b|S^b}(x^b|f^b(s^r)) H(Y|S^r = s^r, X^b = x^b)$. (82)

Substituting (81) in (82) gives

$H(Y|S^r, X^b) = P_{S^r}(0) H(X^a \oplus S|X^b = 1, S^r = 0) + P_{S^r}(1) H(X^a \oplus S|X^b = 1, S^r = 1)$. (83)

We next show that under (81) $H(X^a \oplus S|X^b = 1, S^r = 0) = 1$, for which it is enough to show that $P_{X^a \oplus S|X^b,S^r}(0|1, 0) = 0.5$. We have

$P_{X^a \oplus S|X^b,S^r}(0|1, 0)$

$= \sum_{s \in \{0,1\}} \sum_{x^a \in \{0,1\}} P_{X^a \oplus S|S,X^a,X^b,S^r}(0|s, x^a, 1, 0) P_{S|S^r}(s|0) \pi_{X^a|S^a}(x^a|f^a(0))$ (84)

$= P_{S|S^r}(0|0) \left[ 0.5 P_{X^a \oplus S|S,X^a,X^b,S^r}(0|0, 0, 1, 0) + 0.5 P_{X^a \oplus S|S,X^a,X^b,S^r}(0|0, 1, 1, 0) \right]$
$+ P_{S|S^r}(1|0) \left[ 0.5 P_{X^a \oplus S|S,X^a,X^b,S^r}(0|1, 0, 1, 0) + 0.5 P_{X^a \oplus S|S,X^a,X^b,S^r}(0|1, 1, 1, 0) \right]$
$= 0.5$,

where (84) is due to (32) and (40). We can similarly show that $P_{X^a \oplus S|X^b,S^r}(0|1, 1) = 0.5$ and hence, $H(X^a \oplus S|X^b = 1, S^r = 1) = 1$. Therefore, $H(Y|S^r, X^b) = 1$. Since the above derivation is symmetric, under (80) $H(Y|X^a, S^r) = 1$.

It now remains to show that with (79) $H(Y|S^r)$ is equal to one. It should be observed that

$P_{X^a X^b \oplus S|S^r}(\cdot|s^r)$

$\overset{(i)}{=} \sum_{x^a, x^b, s \in \{0,1\}} P_{X^a X^b \oplus S|X^a,X^b,S}(\cdot|x^a, x^b, s) \pi_{X^a|S^a}(x^a|f^a(s^r)) \pi_{X^b|S^b}(x^b|f^b(s^r)) P_{S|S^r}(s|s^r)$

$\overset{(ii)}{=} 0.25 \sum_{s} P_{S|S^r}(s|s^r) \sum_{x^a, x^b} P_{X^a X^b \oplus S|X^a,X^b,S}(\cdot|x^a, x^b, s)$

$= 0.5$

where (i) is due to (32) and (40), (ii) is due to (79) and the last step is valid since for given $s$, there are only two pairs of $(x^a, x^b)$ for which $P_{X^a X^b \oplus S|X^a,X^b,S}(\cdot|x^a, x^b, s) = 1$ (and zero for the other two). Hence, $H(Y|S^r) = 1$.

Finally, it can be easily shown that the capacity region of $Y = X^a X^b \oplus S$ without CSIT and CSIR is given by the closure of $(R_a, R_b)$ where $R_a \le 1 - H(S)$, $R_b \le 1 - H(S)$ and $R_a + R_b \le 1 - H(S)$.

Therefore, availability of noisy CSI at the encoders (both causal and non-causal) and at the decoder 
increases the capacity region by an amount of I(S; S r ). 
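
The size of this gain can be evaluated in closed form once a state distribution is fixed. The Python sketch below is illustrative only: it assumes $S \sim Ber(p_s)$ with $Z^r$ independent of $S$, neither of which is specified explicitly in the text, and the function names `binary_entropy` and `csi_gain` are ours.

```python
import math


def binary_entropy(p):
    """Binary entropy function in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def csi_gain(p_s, p_r):
    """Returns (H(S), H(S|S^r), I(S;S^r)) for S ~ Ber(p_s) and
    S^r = S xor Z^r with Z^r ~ Ber(p_r) independent of S (assumed)."""
    p_sr_one = p_s * (1.0 - p_r) + (1.0 - p_s) * p_r   # P(S^r = 1)
    mutual_info = binary_entropy(p_sr_one) - binary_entropy(p_r)
    h_s = binary_entropy(p_s)
    return h_s, h_s - mutual_info, mutual_info


# Hypothetical operating point: a biased state and a 10% estimation error.
h_s, h_s_given_sr, gain = csi_gain(0.3, 0.1)
```

Under these assumptions, each bound of the region moves from $1 - H(S)$ to $1 - H(S|S^r)$, so the improvement is exactly the returned mutual information; it vanishes when $p_r = 0.5$ and equals $H(S)$ when $p_r = 0$.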

VI. Conclusion and Remarks 

We have considered several scenarios for the memoryless FS-MAC with asymmetric noisy CSI at 
the encoders and complete and noisy CSI at the receiver. When the encoders have access to causal 
noisy CSI, single letter inner and outer bounds, which are tight for the sum-rate capacity, are obtained. 
Furthermore, under the assumption that CSI at the encoders are provided by the decoder through noisy 
feedback links, we demonstrate that a tight converse for the sum-rate capacity still holds if the decoder 
also observes noisy CSI. In order to reduce the space of optimization, from Shannon strategies to channel 
inputs, we consider the case where CSITs are asymmetric deterministic functions of noisy CSIR. The 
equivalent channel demonstrates that the causal setup of this problem is considered in [1] and a single- 
letter characterization for capacity region is provided. Hence, we also considered the non-causal setup 
and showed that the causal and non-causal capacity regions are identical. 

When the decoder does not need to access the current CSI at the encoder, which matches the delayed scenario, we observe that a single letter characterization of the capacity region can be obtained when the channel state is an i.i.d. stochastic process. We further discuss a cooperative scenario and show that when the common message encoder does not have access to the current noisy CSI, due to delay, it is possible to obtain a single letter expression for the capacity region. Since a product form is not required in a cooperative scenario, we observed that as soon as the common message encoder does not have access to CSI, then in any noisy setup, covering the cases of no CSIR or noisy CSIR, it is possible to obtain the capacity region.

Finally, the following further problems are worth exploring: the complete characterization of the capacity region for the problem defined in Section II and its non-causal extension, the cooperative FS-MAC where both encoders observe causal noisy CSI, and the cooperative FS-MAC where the informed encoder observes noisy CSI non-causally and the other encoder observes noisy CSI with delay.

Appendix A

Converse Proof of Theorem 2.3: Non-Causal Case 

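Proof: Let

$\alpha_{\mu_p,\mu_f} := \frac{1}{n} P_{S^r_{[1,t-1]},S^r_{[t+1,n]}}(\mu_p, \mu_f)$. (85)

Observe that $(\mu_p : \mu_f) \in \mathcal{S}_r^{n-1}$, where $(v : w)$ denotes the concatenation of two vectors $v$ and $w$, and

$\sum_{(\mu_p : \mu_f) \in \mathcal{S}_r^{n-1},\, 1 \le t \le n} \alpha_{\mu_p,\mu_f} = \frac{1}{n} \sum_{1 \le t \le n} \sum_{\mu_p, \mu_f} P_{S^r_{[1,t-1]},S^r_{[t+1,n]}}(\mu_p, \mu_f) = 1$.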

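Lemma A.1: Assume that a rate pair $R = (R_a, R_b)$, with block length $n \ge 1$ and a constant $\epsilon \in (0, 1/2)$, is achievable. Then,

$R_a \le \sum_{(\mu_p : \mu_f)} \alpha_{\mu_p,\mu_f} I(X^a_t; Y_t|X^b_t, S^r_t, S^r_{[t-1]} = \mu_p, S^r_{[t+1,n]} = \mu_f) + \eta(\epsilon)$ (86)

$R_b \le \sum_{(\mu_p : \mu_f)} \alpha_{\mu_p,\mu_f} I(X^b_t; Y_t|X^a_t, S^r_t, S^r_{[t-1]} = \mu_p, S^r_{[t+1,n]} = \mu_f) + \eta(\epsilon)$ (87)

$R_a + R_b \le \sum_{(\mu_p : \mu_f)} \alpha_{\mu_p,\mu_f} I(X^a_t, X^b_t; Y_t|S^r_t, S^r_{[t-1]} = \mu_p, S^r_{[t+1,n]} = \mu_f) + \eta(\epsilon)$ (88)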
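Proof: Let us first consider the sum-rate. With standard steps, we get

$R_a + R_b \le \frac{1}{1-\epsilon} \frac{1}{n} \left( I(W; Y_{[n]}, S^r_{[n]}) + H(\epsilon) \right)$. (89)

Note that since $S^r_{[n]}$ is independent of $W$, we have $I(W; Y_{[n]}, S^r_{[n]}) = I(W; Y_{[n]}|S^r_{[n]})$ and

$I(W; Y_{[n]}|S^r_{[n]}) = \sum_{t=1}^{n} \left[ H(Y_t|S^r_{[n]}, Y_{[t-1]}) - H(Y_t|W, S^r_{[n]}, Y_{[t-1]}) \right]$

$\overset{(i)}{\le} \sum_{t=1}^{n} \left[ H(Y_t|S^r_{[n]}) - H(Y_t|W, S^r_{[n]}, Y_{[t-1]}) \right]$

$\overset{(ii)}{=} \sum_{t=1}^{n} \left[ H(Y_t|S^r_{[n]}) - H(Y_t|W, S^r_{[n]}, Y_{[t-1]}, X_t) \right]$

$\overset{(iii)}{=} \sum_{t=1}^{n} \left[ H(Y_t|S^r_{[n]}) - H(Y_t|S^r_{[n]}, X_t) \right]$

$= \sum_{t=1}^{n} I(X_t; Y_t|S^r_{[n]})$ (90)

where (i) follows since conditioning reduces entropy, (ii) holds since $X^i_t = \phi^{(i)}_t(W_i, f^i(S^r_{[n]}))$, $i \in \{a, b\}$, and (iii) is due to (3). Combining (89) and (90), similar to (21), gives

$R_a + R_b \le \frac{1}{n} \sum_{t=1}^{n} I(X_t; Y_t|S^r_{[n]}) + \eta(\epsilon)$. (91)

Furthermore,

$I(X^a_t, X^b_t; Y_t|S^r_{[n]}) = n \sum_{(\mu_p : \mu_f)} \alpha_{\mu_p,\mu_f} I(X^a_t, X^b_t; Y_t|S^r_t, S^r_{[t-1]} = \mu_p, S^r_{[t+1,n]} = \mu_f)$, (92)

and substituting the above into (91) yields (88).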

Let us now consider encoder a. Using Fano's inequality and standard steps we first get, 

1 1 



Ra < 



1 — en 



I(W a -Y [n] ,S\ n] ) + H(ej). 



Furthermore, 



(0 



I(W a ;Y [n] ,S[ n] ) < I(W a ;Y [n] \S[ n] ,W b ) 

n 

= [H(Y t \Sl n] ,Y [t _ 1} ,W b ) - HWS^Y^W) 



t=i 



< Y [ H W S U , W b ) -H(Y t \ Sf n] , Y [t _ 1} , W) 



t=i 



t=i 



(iv) 



< Y, [H{Yt\S\ nV X b t ) - H^S^Y^ , W, X [n] ) 
t=i 

n 

( = } Y [H{Y t \S\ n] ,X b t ) - H{Y t \S\ nV XlX?) 
t=i 

= Yl{X?-Y t \XlS\ n] ) 



t=i 



(92) 



(93) 



(94) 



where (i) is due to (2) and conditioning reduces entropy, (ii) holds since conditioning reduces entropy, 
(Hi) holds since X\ = </>( l \Wi, f l (S^)), i = {a, b}, (iv) is valid since conditioning reduces entropy 
and finally, (v) is valid due to (3) and S\, i = {a, b}, being a function of S*[. 

Now combining (93)-(94) and following steps akin to (91) and (92), we can verify (86). To verify (87) 
for encoder b it is enough to switch the roles of encoder a and (b). ■ 
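The decomposition used above to pass from (91) to (88) rests on a general fact: a conditional mutual information is the average, over realizations of the conditioning variable, of the mutual informations computed under each realization. The following minimal sketch checks this numerically on a toy joint pmf; the array layout `p[m, x, y]` (with `m` standing in for the conditioning state realization) and all function names are illustrative assumptions, not notation from the paper.

```python
import numpy as np

def entropy(q):
    """Shannon entropy in bits of a pmf given as an array of any shape."""
    q = np.asarray(q, dtype=float).ravel()
    q = q[q > 0]
    return float(-np.sum(q * np.log2(q)))

def mutual_information(p_xy):
    """I(X;Y) in bits for a joint pmf given as a 2-D array p_xy[x, y]."""
    p_xy = np.asarray(p_xy, dtype=float)
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    prod = p_x @ p_y                      # product of the marginals
    mask = p_xy > 0
    return float(np.sum(p_xy[mask] * np.log2(p_xy[mask] / prod[mask])))

def conditional_mi_by_averaging(p_mxy):
    """I(X;Y|M) as sum_m P(m) * I(X;Y | M = m), for p_mxy[m, x, y]."""
    total = 0.0
    for m in range(p_mxy.shape[0]):
        p_m = p_mxy[m].sum()
        if p_m > 0:
            total += p_m * mutual_information(p_mxy[m] / p_m)
    return total

def conditional_mi_by_entropies(p_mxy):
    """I(X;Y|M) = H(X,M) + H(Y,M) - H(X,Y,M) - H(M)."""
    return (entropy(p_mxy.sum(axis=2)) + entropy(p_mxy.sum(axis=1))
            - entropy(p_mxy) - entropy(p_mxy.sum(axis=(1, 2))))

# Toy example (hypothetical sizes): 3 conditioning realizations, |X| = 2, |Y| = 4.
rng = np.random.default_rng(0)
p = rng.random((3, 2, 4))
p /= p.sum()
print(conditional_mi_by_averaging(p), conditional_mi_by_entropies(p))
```

The two printed numbers should agree up to floating-point error. In the proof, the weights $P(\mu_p,\mu_f)$ are additionally scaled by $1/n$ to form $\alpha_{\mu_p,\mu_f}$, which accounts for the factor $n$ in the decomposition.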
Observe now that for any $t \ge 1$, $I(X^a_t,X^b_t;Y_t|S^r_t,S^r_{[t-1]}=\mu_p,S^r_{[t+1,n]}=\mu_f)$ is a function of the 
conditional distribution $P_{X^a_t,X^b_t,Y_t,S^r_t|S^r_{[t-1]},S^r_{[t+1,n]}}(x^a,x^b,y,s^r|\mu_p,\mu_f)$. Hence, we need to show that this 
distribution factorizes as in (44). Let 

$$\Upsilon^a_{\mu_p,\mu_f}(x^a,f^a(s^r)) := \{w_a : \phi^{(a)}_t(w_a,f^a(\mu_p,\mu_f),f^a(s^r)) = x^a\},$$

$$\Upsilon^b_{\mu_p,\mu_f}(x^b,f^b(s^r)) := \{w_b : \phi^{(b)}_t(w_b,f^b(\mu_p,\mu_f),f^b(s^r)) = x^b\} \tag{95}$$

and 

$$\pi^i_{\mu_p,\mu_f}(x^i|f^i(s^r)) := \sum_{w_i\in\Upsilon^i_{\mu_p,\mu_f}(x^i,f^i(s^r))}\frac{1}{|\mathcal{W}_i|},\qquad i \in \{a,b\}. \tag{96}$$

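Lemma A.2: For every $1 \le t \le n$ and $(\mu_p : \mu_f) \in \mathcal{S}_r^{n-1}$ the following holds 

$$P_{X^a_t,X^b_t,Y_t,S^r_t|S^r_{[t-1]},S^r_{[t+1,n]}}(x^a,x^b,y,s^r|\mu_p,\mu_f) = P_{S^r}(s^r)\,P_{Y|S^r,X^a,X^b}(y|s^r,x^a,x^b)\,\pi^a_{\mu_p,\mu_f}(x^a|f^a(s^r))\,\pi^b_{\mu_p,\mu_f}(x^b|f^b(s^r)). \tag{97}$$

Proof: First observe that due to (3) we have 

$$P_{X^a_t,X^b_t,Y_t,S^r_t|S^r_{[t-1]},S^r_{[t+1,n]}}(x^a,x^b,y,s^r|\mu_p,\mu_f) = P_{Y_t|S^r_t,X^a_t,X^b_t}(y|s^r,x^a,x^b)\,P_{X^a_t,X^b_t,S^r_t|S^r_{[t-1]},S^r_{[t+1,n]}}(x^a,x^b,s^r|\mu_p,\mu_f). \tag{98}$$

Let us now consider the second term in (98). We have 

$$\begin{aligned}
&P_{X^a_t,X^b_t,S^r_t|S^r_{[t-1]},S^r_{[t+1,n]}}(x^a,x^b,s^r|\mu_p,\mu_f) \\
&\quad= \sum_{w_a}\sum_{w_b} P_{W_a,W_b,X^a_t,X^b_t,S^r_t|S^r_{[t-1]},S^r_{[t+1,n]}}(w_a,w_b,x^a,x^b,s^r|\mu_p,\mu_f) \\
&\quad\overset{(i)}{=} \sum_{w_a}\sum_{w_b} 1_{\{x^i=\phi^{(i)}_t(w_i,f^i(s^r,\mu_p,\mu_f)),\ i=a,b\}}\,P_{W_a,W_b,S^r_t|S^r_{[t-1]},S^r_{[t+1,n]}}(w_a,w_b,s^r|\mu_p,\mu_f) \\
&\quad\overset{(ii)}{=} \sum_{w_a}\sum_{w_b} 1_{\{x^i=\phi^{(i)}_t(w_i,f^i(s^r,\mu_p,\mu_f)),\ i=a,b\}}\,\frac{1}{|\mathcal{W}_a|}\frac{1}{|\mathcal{W}_b|}\,P_{S^r_t}(s^r) \\
&\quad= P_{S^r}(s^r)\sum_{w_a}\frac{1}{|\mathcal{W}_a|}1_{\{x^a=\phi^{(a)}_t(w_a,f^a(s^r,\mu_p,\mu_f))\}}\sum_{w_b}\frac{1}{|\mathcal{W}_b|}1_{\{x^b=\phi^{(b)}_t(w_b,f^b(s^r,\mu_p,\mu_f))\}} \\
&\quad\overset{(iii)}{=} P_{S^r}(s^r)\,\pi^a_{\mu_p,\mu_f}(x^a|f^a(s^r))\,\pi^b_{\mu_p,\mu_f}(x^b|f^b(s^r))
\end{aligned} \tag{99}$$

where (i) follows since $X^i_t = \phi^{(i)}_t(W_i,f^i(S^r_{[n]}))$, $i \in \{a,b\}$, (ii) is valid since $W_a$ and $W_b$ are independent 
of $S^r_{[n]}$ and the state process is i.i.d., and (iii) follows due to (95) and (96). Substituting (99) in (98) 
completes the proof. ■ 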
We can now complete the proof of Theorem 2.3. With Lemma A.1, it is shown that any achievable rate pair 
can be approximated by the convex combinations of rate conditions given in (41)-(43) which are indexed 
by $(\mu_p,\mu_f)$ and satisfy (44) for joint state-input-output distributions. Hence, since $\lim_{\epsilon\to 0}\eta(\epsilon) = 0$, any 
achievable rate pair belongs to $\mathrm{co}\big(\bigcup_{\pi}\mathcal{R}(\pi)\big)$, where $\mathcal{R}(\pi)$ denotes the rate region defined by (41)-(43). ■ 
Appendix B 
Converse Proof of Theorem 3.1 

Proof: In the proof, we will use the fact that the delayed setup can be modeled by taking the last 
$d_a$, $d_b$ entries of the causal setup as empty. Recall that $\alpha_\mu$ is defined in (85). 

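Lemma B.1: Assume that a rate pair $R = (R_a,R_b)$, with block length $n \ge 1$ and a constant $\epsilon \in (0,1/2)$, is achievable. Then, 

$$R_a \le \sum_{\mu} \alpha_\mu\, I(X^a_t;Y_t|X^b_t,S_t,S_{[t-1]}=\mu) + \eta(\epsilon) \tag{100}$$

$$R_b \le \sum_{\mu} \alpha_\mu\, I(X^b_t;Y_t|X^a_t,S_t,S_{[t-1]}=\mu) + \eta(\epsilon) \tag{101}$$

$$R_a + R_b \le \sum_{\mu} \alpha_\mu\, I(X^a_t,X^b_t;Y_t|S_t,S_{[t-1]}=\mu) + \eta(\epsilon). \tag{102}$$

Proof: For the sum-rate, observe that the derivation in (20) can be performed to verify (102), since for 
$d_i \ge 1$ we have $T^i_t = X^i_t$ by taking $S^i_{[t-d_i+1,t-1]} = \emptyset$, $i \in \{a,b\}$. 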
Let us now consider encoder $a$. We have 

$$R_a \le \frac{1}{n}\log(|\mathcal{W}_a|) \le \frac{1}{n(1-\epsilon)}\left(I(W_a;Y_{[n]},S_{[n]}) + H(\epsilon)\right). \tag{103}$$
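Bounds such as (103) rest on Fano's inequality. As a hedged illustration, the sketch below checks the standard form $H(W|V) \le H(P_e) + P_e \log_2(|\mathcal{W}|-1)$ on a toy joint pmf, where $V$ plays the role of the decoder's observation and $P_e$ is the error probability of the MAP estimate. The function names and the toy pmf are assumptions made for this example; the exact variant of the inequality used in the derivation is the one given in the main text.

```python
import numpy as np

def binary_entropy(p):
    """H(p) in bits, with the convention 0 log 0 = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return float(-p * np.log2(p) - (1 - p) * np.log2(1 - p))

def fano_terms(p_wv):
    """Return (H(W|V), P_e, Fano bound) for a joint pmf p_wv[w, v].

    P_e is the error probability of the MAP estimate of W from V.
    """
    p_wv = np.asarray(p_wv, dtype=float)
    p_v = p_wv.sum(axis=0)
    cond = p_wv / np.where(p_v > 0, p_v, 1.0)      # P(w | v), column-wise
    mask = p_wv > 0
    h_w_given_v = float(-np.sum(p_wv[mask] * np.log2(cond[mask])))
    # MAP decoding picks the most likely w for each v.
    p_e = max(0.0, 1.0 - float(p_wv.max(axis=0).sum()))
    n_w = p_wv.shape[0]
    bound = binary_entropy(p_e) + p_e * float(np.log2(max(n_w - 1, 1)))
    return h_w_given_v, p_e, bound

# Toy example (hypothetical): 4 messages, 5 observation symbols.
rng = np.random.default_rng(0)
p = rng.random((4, 5))
p /= p.sum()
h, pe, bound = fano_terms(p)
print(h, pe, bound)
```

The first printed value never exceeds the third, which is what allows the message entropy to be traded for a mutual information term plus a vanishing penalty.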

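Furthermore, 

$$\begin{aligned}
I(W_a;Y_{[n]},S_{[n]}) &\overset{(i)}{\le} I(W_a;Y_{[n]},S_{[n]}|W_b,S^b_{[n]}) \\
&= \sum_{t=1}^{n}\left[H(Y_t,S_t|S_{[t-1]},Y_{[t-1]},W_b,S^b_{[n]}) - H(Y_t,S_t|S_{[t-1]},Y_{[t-1]},W,S^b_{[n]})\right] \\
&\overset{(ii)}{=} \sum_{t=1}^{n}\left[H(Y_t|S_{[t]},Y_{[t-1]},W_b,S^b_{[n]}) - H(Y_t|S_{[t]},Y_{[t-1]},W,S^b_{[n]})\right] \\
&\overset{(iii)}{=} \sum_{t=1}^{n}\left[H(Y_t|S_{[t]},Y_{[t-1]},W_b,S^b_{[n]},X^b_{[n]}) - H(Y_t|S_{[t]},Y_{[t-1]},W,S^b_{[n]},X^b_{[n]})\right] \\
&\overset{(iv)}{\le} \sum_{t=1}^{n}\left[H(Y_t|S_{[t]},X^b_t) - H(Y_t|S_{[t]},Y_{[t-1]},W,S^b_{[n]},X^b_{[n]},X^a_t)\right] \\
&\overset{(v)}{=} \sum_{t=1}^{n}\left[H(Y_t|S_{[t]},X^b_t) - H(Y_t|S_{[t]},X^b_t,X^a_t)\right] \\
&= \sum_{t=1}^{n} I(X^a_t;Y_t|X^b_t,S_{[t]})
\end{aligned} \tag{104}$$

where (i) is due to (2) and conditioning reduces entropy, (ii) is valid since 

$$\begin{aligned}
P_{S_t|S^b_t}(s_t|s^b_t) &= P_{S_t|Y_{[t-1]},S_{[t-1]},W_a,W_b,S^b_{[n]}}(s_t|y_{[t-1]},s_{[t-1]},w_a,w_b,s^b_{[n]}) \\
&= P_{S_t|Y_{[t-1]},S_{[t-1]},W_b,S^b_{[n]}}(s_t|y_{[t-1]},s_{[t-1]},w_b,s^b_{[n]})
\end{aligned} \tag{105}$$

where the second equality is due to (2), (iii) is valid since $X^b_t = \phi^{(b)}_t(W_b,S^b_{[t-d_b]})$, (iv) is valid since 
conditioning reduces entropy and finally, (v) is valid by (3). 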

Now, recalling the definition of $\eta(\epsilon)$ and combining (103) and (104) gives 

$$R_a \le \frac{1}{n}\sum_{t=1}^{n} I(X^a_t;Y_t|X^b_t,S_{[t]}) + \eta(\epsilon). \tag{106}$$

Furthermore, 

$$I(X^a_t;Y_t|X^b_t,S_{[t]}) = n\sum_{\mu}\alpha_\mu\, I(X^a_t;Y_t|X^b_t,S_t,S_{[t-1]}=\mu), \tag{107}$$

and substituting the above into (106) yields (100). 

Finally, for encoder $b$, (101) can be verified by following steps similar to those for encoder $a$. ■ 
Now since, for any $t \ge 1$, the conditional mutual information terms given in (100)-(102) are functions of 
$P_{X^a_t,X^b_t,Y_t,S_t|S_{[t-1]}}(x^a,x^b,y,s|\mu)$, in order to complete the proof of the converse, we need to show that 
this term factorizes as in (50). 

Lemma B.2: For every 1 < t < n and /x G the following holds 

^^y t ,s t |s [t _JsV^^ (108) 

Note that one of the crucial steps in verifying the product form for the causal setup, see (18) and (19), 
is the independence of the Shannon strategies from the current state. This also holds in the delayed setup. 
Therefore, let 

(x*) :={ Wi : <j>f (wi, s\ t _ di] = W ) = x 1 }, i = a,b (109) 

and 

: = Tyy~r ^x^) — Jl^x^Psi^S^M^' i = a,b. 

Hence, (108) can be shown following the same steps in Lemma 2.2. 

We can now complete the converse proof of Theorem 3.1. With Lemma B.1 it is shown that any 
achievable rate pair can be approximated by the convex combinations of rate conditions which are 
indexed by $\mu$ and satisfy (50) for joint state-input-output distributions. Hence, any achievable 
pair $(R_a,R_b) \in \mathrm{co}\big(\bigcup_{\pi}\mathcal{R}_{DN}(\pi)\big)$. ■ 
Appendix C 

Achievability and Converse Proofs of Theorem 4.1 

Achievability Proof: Fix $(R_a,R_b) \in \mathcal{R}_C(\pi)$. 

Codebook Generation: Fix $\pi_{X^a}(x^a)$ and $\pi_{T^b|X^a}(t^b|x^a)$. For each $w_a \in \{1,\dots,2^{nR_a}\}$, randomly 
generate $x^a_{[n],w_a}$, each according to $\prod_{i=1}^{n}\pi_{X^a}(x^a_{i,w_a})$. Reveal this codebook to encoder $b$ and, for each 
$w_b \in \{1,\dots,2^{nR_b}\}$, encoder $b$ randomly generates $t^b_{[n],w_b}$, each according to $\prod_{i=1}^{n}\pi_{T^b|X^a}(t^b_{i,w_b}|x^a_{i,w_a})$. 
These codeword pairs form the codebook, which is revealed to the decoder. 

Encoding: Define the encoding functions as follows: $x^a_i(w_a) = \phi^a_i(w_a,s^a_{[i-d_a]})$ and $x^b_i(w_b) = \phi^b_i(w_b,s^b_{[i]}) = t^b_{i,w_b}(s^b_i)$, 
where $x^a_{i,w_a}$ and $t^b_{i,w_b}$ denote the $i$th component of $x^a_{[n],w_a}$ and $t^b_{[n],w_b}$, respectively. Therefore, 
to send the messages $w_a$ and $w_b$, transmit the corresponding $x^a_{[n],w_a}$ and $t^b_{[n],w_b}$, respectively. 

Decoding: After receiving $(y_{[n]},s_{[n]})$, the decoder looks for the only $(w_a,w_b)$ pair such that $(x^a_{[n],w_a},t^b_{[n],w_b},y_{[n]},s_{[n]})$ 
are jointly $\epsilon$-typical and declares this pair as its estimate $(\hat{w}_a,\hat{w}_b)$. 

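Error Analysis: Assume that $(w_a,w_b) = (1,1)$ was sent. Let $E_{\alpha,\beta} := \{(X^a_{[n],\alpha},T^b_{[n],\beta},Y_{[n]},S_{[n]}) \in A^n_\epsilon\}$, 
$\alpha \in \{1,\dots,2^{nR_a}\}$ and $\beta \in \{1,\dots,2^{nR_b}\}$. Then 

$$P_e = P\Big(E^c_{1,1}\cup\bigcup_{(\alpha,\beta)\neq(1,1)}E_{\alpha,\beta}\Big) \le P(E^c_{1,1}) + \sum_{\alpha=1,\beta\neq 1}P(E_{\alpha,\beta}) + \sum_{\alpha\neq 1,\beta\ge 1}P(E_{\alpha,\beta}). \tag{110}$$

Since $\{Y_i,S_i,X^a_i,X^b_i\}_{i=1}^{n}$ is an i.i.d. sequence, $P(E^c_{1,1}) \to 0$ as $n \to \infty$. Next, let us consider the 
second term 

$$\begin{aligned}
\sum_{\alpha=1,\beta\neq 1}P(E_{\alpha,\beta}) &= \sum_{\alpha=1,\beta\neq 1}P\big((X^a_{[n],1},T^b_{[n],\beta},Y_{[n]},S_{[n]}) \in A^n_\epsilon\big) \\
&\overset{(i)}{=} \sum_{\alpha=1,\beta\neq 1}\ \sum_{(t^b_{[n]},x^a_{[n]},y_{[n]},s_{[n]})\in A^n_\epsilon} P_{T^b_{[n]}|X^a_{[n]}}(t^b_{[n]}|x^a_{[n]})\,P_{X^a_{[n]},Y_{[n]},S_{[n]}}(x^a_{[n]},y_{[n]},s_{[n]}) \\
&\le \sum_{\alpha=1,\beta\neq 1} |A^n_\epsilon|\,2^{-n[H(T^b|X^a)-\epsilon]}\,2^{-n[H(X^a,Y,S)-\epsilon]} \\
&\le 2^{nR_b}\,2^{-n[H(T^b|X^a)+H(X^a,Y,S)-H(X^a,T^b,Y,S)-3\epsilon]} \\
&\overset{(ii)}{=} 2^{n[R_b-I(T^b;Y|S,X^a)+3\epsilon]}
\end{aligned} \tag{111}$$

where (i) holds since $T^b_{[n],\beta}$ is independent of $(Y_{[n]},S_{[n]})$ given $X^a_{[n],1}$ and (ii) follows since 

$$\begin{aligned}
&H(T^b|X^a) + H(X^a,Y,S) - H(X^a,T^b,Y,S) \\
&\quad= H(T^b|X^a) + H(X^a,Y,S) - H(Y|X^a,T^b,S) - H(X^a,T^b,S) \\
&\quad= H(X^a,Y,S) - H(Y|X^a,T^b,S) - H(X^a,S) \\
&\quad= I(T^b;Y|S,X^a)
\end{aligned}$$

where the second equality follows since $T^b$ and $S$ are independent given $X^a$. Finally, 

$$\begin{aligned}
\sum_{\alpha\neq 1,\beta\ge 1}P(E_{\alpha,\beta}) &= \sum_{\alpha\neq 1,\beta\ge 1}P\big((X^a_{[n],\alpha},T^b_{[n],\beta},Y_{[n]},S_{[n]}) \in A^n_\epsilon\big) \\
&\overset{(iii)}{=} \sum_{\alpha\neq 1,\beta\ge 1}\ \sum_{(t^b_{[n]},x^a_{[n]},y_{[n]},s_{[n]})\in A^n_\epsilon} P_{T^b_{[n]},X^a_{[n]}}(t^b_{[n]},x^a_{[n]})\,P_{Y_{[n]},S_{[n]}}(y_{[n]},s_{[n]}) \\
&\le \sum_{\alpha\neq 1,\beta\ge 1} |A^n_\epsilon|\,2^{-n[H(T^b,X^a)-\epsilon]}\,2^{-n[H(Y,S)-\epsilon]} \\
&\le 2^{n(R_a+R_b)}\,2^{-n[H(T^b,X^a)+H(Y,S)-H(X^a,T^b,Y,S)-3\epsilon]} \\
&\overset{(iv)}{=} 2^{n[R_a+R_b-I(X^a,T^b;Y|S)+3\epsilon]}
\end{aligned} \tag{112}$$

where (iii) holds since for $\alpha \neq 1$, $(T^b_{[n],\beta},X^a_{[n],\alpha})$ is independent of $(Y_{[n]},S_{[n]})$ and (iv) follows since 

$$\begin{aligned}
&H(T^b,X^a) + H(Y,S) - H(X^a,T^b,Y,S) \\
&\quad= H(T^b,X^a) + H(Y,S) - H(Y|X^a,S,T^b) - H(X^a,S,T^b) \\
&\quad= H(T^b,X^a) + H(Y,S) - H(Y|X^a,S,T^b) - H(X^a,T^b) - H(S) \\
&\quad= I(X^a,T^b;Y|S),
\end{aligned}$$

and the rate conditions of $\mathcal{R}_C(\pi)$ imply that each term in (110) tends to zero as $n \to \infty$. ■ 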
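The two entropy identities invoked in the error analysis can be checked numerically. The sketch below builds a toy joint pmf in which $S$ is independent of $(X^a,T^b)$, which implies both independence conditions used above, and compares each entropy combination with the corresponding conditional mutual information. The alphabet sizes, the axis ordering and the helper names are assumptions made purely for illustration.

```python
import numpy as np

def entropy(p, keep):
    """Entropy in bits of the marginal of pmf array `p` on the axes in `keep`."""
    drop = tuple(ax for ax in range(p.ndim) if ax not in keep)
    q = p.sum(axis=drop) if drop else p
    q = q.ravel()
    q = q[q > 0]
    return float(-np.sum(q * np.log2(q)))

def toy_joint(seed=0, nx=2, nt=3, ns=2, ny=3):
    """Joint pmf p[x, t, s, y] = p(x, t) p(s) p(y | x, t, s)."""
    rng = np.random.default_rng(seed)
    p_xt = rng.random((nx, nt))
    p_xt /= p_xt.sum()
    p_s = rng.random(ns)
    p_s /= p_s.sum()
    p_y = rng.random((nx, nt, ns, ny))
    p_y /= p_y.sum(axis=3, keepdims=True)
    return p_xt[:, :, None, None] * p_s[None, None, :, None] * p_y

X, T, S, Y = 0, 1, 2, 3            # axis labels of the toy pmf
p = toy_joint()

def H(*axes):
    return entropy(p, axes)

# H(T|X) + H(X,Y,S) - H(X,T,Y,S)  versus  I(T;Y|S,X)
lhs1 = (H(X, T) - H(X)) + H(X, Y, S) - H(X, T, Y, S)
rhs1 = H(X, S, Y) - H(X, S) - H(X, T, S, Y) + H(X, T, S)

# H(T,X) + H(Y,S) - H(X,T,Y,S)  versus  I(X,T;Y|S)
lhs2 = H(T, X) + H(Y, S) - H(X, T, Y, S)
rhs2 = H(Y, S) - H(S) - H(X, T, S, Y) + H(X, T, S)

print(lhs1 - rhs1, lhs2 - rhs2)
```

Both printed differences should be zero up to floating-point error. If the independence structure is broken, for instance by letting `p_s` depend on `t`, the identities generally fail, which is why the proof states the independence assumptions explicitly.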
Note that the main motivation for indexing the mutual information terms by the past CSI is to obtain a product 
form on the team policies. In the cooperative setup, we do not require a product form, and therefore the 
convex combination argument is not essential. However, we keep this indexing here (see (54)) to avoid 
the use of a time-sharing auxiliary random variable. 

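Converse Proof: First observe that, since $X^b_t = \phi^{(b)}_t(W_a,W_b,S^b_{[t-1]},S^b_t)$, we have 

$$T^b_t = \phi^{(b)}_t(W_a,W_b,S^b_{[t-1]}) \in \mathcal{T}_b. \tag{113}$$

Lemma C.1: Let $T^b_t \in \mathcal{T}_b$ be the Shannon strategy induced by $\phi^{(b)}_t$ as shown in (113). Assume that a 
rate pair $R = (R_a,R_b)$, with block length $n \ge 1$ and a constant $\epsilon \in (0,1/2)$, is achievable. Then, 

$$R_b \le \sum_{\mu}\alpha_\mu\, I(T^b_t;Y_t|X^a_t,S_t,S_{[t-1]}=\mu) + \eta(\epsilon) \tag{114}$$

$$R_a + R_b \le \sum_{\mu}\alpha_\mu\, I(X^a_t,T^b_t;Y_t|S_t,S_{[t-1]}=\mu) + \eta(\epsilon) \tag{115}$$

where $\alpha_\mu$ and $\eta(\epsilon)$ are defined in (14). 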
Proof: Let us first consider the sum-rate condition. Since 

$$\begin{aligned}
I(W;Y_{[n]},S_{[n]}) &\overset{(i)}{\le} \sum_{t=1}^{n}\left[H(Y_t|S_{[t]}) - H(Y_t|W,S_{[t]},Y_{[t-1]},X^a_t,T^b_t)\right] \\
&= \sum_{t=1}^{n}\left[H(Y_t|S_{[t]}) - H(Y_t|S_{[t]},X^a_t,T^b_t)\right] \\
&= \sum_{t=1}^{n} I(X^a_t,T^b_t;Y_t|S_{[t]}),
\end{aligned} \tag{116}$$

where (i) can be shown in a similar way as (21), we have 

$$R_a + R_b \le \frac{1}{n}\sum_{t=1}^{n} I(X^a_t,T^b_t;Y_t|S_{[t]}) + \eta(\epsilon) \tag{117}$$

and 

$$I(X^a_t,T^b_t;Y_t|S_{[t]}) = n\sum_{\mu}\alpha_\mu\, I(X^a_t,T^b_t;Y_t|S_t,S_{[t-1]}=\mu). \tag{118}$$

Substituting the above into (117) yields (115). 

Let us now consider encoder $b$. With Fano's inequality and standard steps, we get 

$$R_b \le \frac{1}{n}\log(|\mathcal{W}_b|) \le \frac{1}{n(1-\epsilon)}\left(I(W_b;Y_{[n]},S_{[n]}) + H(\epsilon)\right). \tag{119}$$

Following reasoning similar to that in (104), we get 

$$\begin{aligned}
I(W_b;Y_{[n]},S_{[n]}) &\le I(W_b;Y_{[n]},S_{[n]}|W_a,S^a_{[n]}) \\
&= \sum_{t=1}^{n}\left[H(Y_t|S_{[t]},Y_{[t-1]},W_a,S^a_{[n]}) - H(Y_t|S_{[t]},Y_{[t-1]},W_a,W_b,S^a_{[n]})\right] \\
&= \sum_{t=1}^{n}\left[H(Y_t|S_{[t]},Y_{[t-1]},W_a,S^a_{[n]},X^a_{[n]}) - H(Y_t|S_{[t]},Y_{[t-1]},W_a,W_b,S^a_{[n]},X^a_{[n]})\right] \\
&\le \sum_{t=1}^{n}\left[H(Y_t|S_{[t]},X^a_t) - H(Y_t|S_{[t]},Y_{[t-1]},W_a,W_b,S^a_{[n]},X^a_{[n]},T^b_t)\right] \\
&\overset{(i)}{=} \sum_{t=1}^{n}\left[H(Y_t|S_{[t]},X^a_t) - H(Y_t|S_{[t]},X^a_t,T^b_t)\right] \\
&= \sum_{t=1}^{n} I(T^b_t;Y_t|X^a_t,S_{[t]})
\end{aligned} \tag{120}$$

where (i) is valid since 

$$\begin{aligned}
&P_{Y_t|S_{[t]},Y_{[t-1]},W,S^a_{[n]},X^a_{[n]},T^b_t}(y_t|s_{[t]},y_{[t-1]},w,s^a_{[n]},x^a_{[n]},t^b_t) \\
&\quad= \sum_{s^b_t\in\mathcal{S}_b} P_{Y_t|S_t,S^b_t,X^a_t,T^b_t}(y_t|s_t,s^b_t,x^a_t,t^b_t)\,P_{S^b_t|S_{[t]},Y_{[t-1]},W,S^a_{[n]},X^a_{[n]},T^b_t}(s^b_t|s_{[t]},y_{[t-1]},w,s^a_{[n]},x^a_{[n]},t^b_t) \\
&\quad= \sum_{s^b_t\in\mathcal{S}_b} P_{Y_t|S_t,S^b_t,X^a_t,T^b_t}(y_t|s_t,s^b_t,x^a_t,t^b_t)\,P_{S^b_t|S_t}(s^b_t|s_t) \\
&\quad= P_{Y_t|S_t,X^a_t,T^b_t}(y_t|s_t,x^a_t,t^b_t),
\end{aligned} \tag{121}$$

where the first equality is due to (3) and the second equality is due to (1) and (2). Following (21), we 
can directly verify (114). ■ 
We now need to show that the joint conditional distribution of the channel state $S_t$, the inputs $X^a_t$, $T^b_t$ and the output 
$Y_t$ given the past realization $(S_{[t-1]} = \mu)$, i.e., $P_{X^a_t,T^b_t,Y_t,S_t|S_{[t-1]}}(x^a,t^b,y,s|\mu)$, factorizes as in (54). This 
is straightforward. First let $\pi_{X^a,T^b}(x^a,t^b) := P_{X^a_t,T^b_t|S_{[t-1]}}(x^a,t^b|\mu)$ and observe that 

$$\begin{aligned}
&P_{X^a_t,T^b_t,Y_t,S_t|S_{[t-1]}}(x^a,t^b,y,s|\mu) \\
&\quad= \sum_{s^b_t\in\mathcal{S}_b} P_{Y_t|X^a_t,X^b_t,S_t}(y|x^a,t^b(s^b_t),s)\,P_{S^b_t|S_t}(s^b_t|s)\,P_{S_t}(s)\,P_{X^a_t,T^b_t|S_{[t-1]}}(x^a,t^b|\mu) \\
&\quad= \pi_{X^a,T^b}(x^a,t^b)\,P_{S_t}(s)\,P_{Y_t|X^a_t,T^b_t,S_t}(y|x^a,t^b,s)
\end{aligned} \tag{122}$$

where the equalities are verified by (3), by (1) and by the fact that $(X^a_t,T^b_t)$ is independent of $S_t$. ■ 
We can now complete the converse proof of Theorem 4.1. With Lemma C.1 it is shown that any 
achievable rate pair can be approximated by the convex combinations of rate conditions which are 
indexed by $\mu$ and satisfy (54) for joint state-input-output distributions. Hence, any achievable 
pair $(R_a,R_b) \in \mathrm{co}\big(\bigcup_{\pi}\mathcal{R}_C(\pi)\big)$. 

Appendix D 
Proof of Cf s = C A s 

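Let us first show that $\mathcal{C}_{PS} \subseteq \mathcal{C}_{AS}$. Recall that $T \in \mathcal{T}$, $|\mathcal{T}| = |\mathcal{X}_b|^{|\mathcal{S}|}$ and $|\mathcal{U}| \le |\mathcal{X}_a||\mathcal{X}_b||\mathcal{S}| + 1$. Hence, 
either $|\mathcal{U}| \ge |\mathcal{T}|$ or $|\mathcal{U}| < |\mathcal{T}|$. In the case where $|\mathcal{U}| < |\mathcal{T}|$, we note that $|\mathcal{U}|$ is limited to a finite set without 
loss of generality. Hence, we can always take $|\mathcal{U}|$ to be at least $|\mathcal{T}|$ such that it satisfies (62), (63) and (64). 
Then we can directly conclude that $\mathcal{C}_{PS} \subseteq \mathcal{C}_{AS}$ since $P_{X^b|S,T}(x^b|s,t) = P_{X^b|S,T,X^a}(x^b|s,t,x^a) = 1_{\{x^b=t(s)\}}$ 
and this is a special case of $P_{X^b|U,X^a,S}(x^b|u,x^a,s)$. 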

In order to prove the other direction, i.e., Cas Q Cp S , let C\ s be the closure of all rate pairs (R a , Rj,) 
satisfying 

$$R_b \le I(U;Y|X^a) \tag{123}$$

$$R_b + R_a \le I(U,X^a;Y) \tag{124}$$

for some joint measure on $\mathcal{S}\times\mathcal{X}_a\times\mathcal{X}_b\times\mathcal{Y}\times\mathcal{U}$ having the form 

$$P_{Y|X^a,X^b,S}(y|x^a,x^b,s)\,1_{\{x^b=m(s,x^a,u)\}}\,P_S(s)\,P_{X^a,U}(x^a,u), \tag{125}$$
for some m : U x X a x S — >■ A^, where < |«S||Af ||A^| + 1, and we first show that Cas = Cas> an( ^ 
following this, we show that C\ s C 
Lemma D.l: Cas = Cas- 

Proof: Obviously C^ s C C^s and hence, we will show that Cas C^f s . Let Px b ,x<*,u,s( xl \ xa i n > s ) 
be a joint distribution in the form of (64), i.e., 

Px>,x;u,s( xb > x ^ u > s ) = Px>\x°,u,s( xb \ xa > u > s)Ps(s)Px«,u(x a ,u). (126) 

Let $A$ denote a $|\mathcal{X}_a||\mathcal{U}||\mathcal{S}|$-by-$|\mathcal{X}_b|$ matrix where $A_{ijkl} = P_{X^b|X^a,U,S}(i|j,k,l)$, $1 \le i \le |\mathcal{X}_b|$, $1 \le j \le |\mathcal{X}_a|$, 
$1 \le k \le |\mathcal{U}|$ and $1 \le l \le |\mathcal{S}|$. Hence, $A$ is a $|\mathcal{X}_a||\mathcal{U}||\mathcal{S}|$-by-$|\mathcal{X}_b|$ row stochastic matrix, i.e., 
$A_{ijkl} \ge 0$, $\forall\, i,j,k,l$ and $\sum_{i=1}^{|\mathcal{X}_b|} A_{ijkl} = 1$, $\forall\, j,k,l$. Let $\bar{A}$ denote a $|\mathcal{X}_a||\mathcal{U}||\mathcal{S}|$-by-$|\mathcal{X}_b|$ binary stochastic 
matrix, that is, a matrix in which each row has exactly one non-zero element, which is 1. Observe now that 
any row stochastic matrix can be written as a convex combination of binary stochastic matrices (e.g., see 
[28, Lemma 5] and [29, Proposition IV.1]). Therefore, we have 

$$A = \sum_{i=1}^{k}\lambda_i A^{(i)},\qquad \sum_{i=1}^{k}\lambda_i = 1, \tag{127}$$

where $A^{(i)}$ is a binary stochastic matrix and by [28, Lemma 5], $k \le (|\mathcal{X}_a||\mathcal{U}||\mathcal{S}|)^2$. 
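To make the decomposition in (127) concrete, the following sketch peels binary stochastic matrices off a row stochastic matrix one at a time. This is a generic greedy construction chosen for illustration, not necessarily the one behind [28, Lemma 5]; the function name and the tolerance parameter are assumptions of this example. Each pass zeroes at least one entry of the residual, so the number of terms is at most the number of entries of the matrix.

```python
import numpy as np

def decompose_row_stochastic(a, tol=1e-12):
    """Greedily write a row-stochastic matrix as sum_i lam_i * B_i,
    where every B_i is binary with exactly one 1 in each row."""
    residual = np.array(a, dtype=float)
    rows = np.arange(residual.shape[0])
    weights, matrices = [], []
    for _ in range(residual.size):          # each pass zeroes >= 1 entry
        cols = residual.argmax(axis=1)      # one column choice per row
        lam = residual[rows, cols].min()    # largest weight we can peel off
        if lam <= tol:
            break
        b = np.zeros_like(residual)
        b[rows, cols] = 1.0
        weights.append(lam)
        matrices.append(b)
        residual -= lam * b
    return np.array(weights), matrices

# Toy example (hypothetical 2-by-2 conditional pmf, rows sum to one).
a = np.array([[0.5, 0.5],
              [0.25, 0.75]])
lams, mats = decompose_row_stochastic(a)
print(lams)                                  # convex weights
print(sum(l * m for l, m in zip(lams, mats)))  # reconstructs a
```

Each binary matrix corresponds to a deterministic map from the row index $(x^a,u,s)$ to an input $x^b$, which is the interpretation used in the remainder of the argument.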
Let, for the joint distribution $P_{X^b,X^a,U,S}(x^b,x^a,u,s)$, 

$$R_b \le I(U;Y|X^a)_A, \tag{128}$$

$$R_a + R_b \le I(U,X^a;Y)_A. \tag{129}$$

Therefore, $(R_a,R_b) \in \mathcal{C}_{AS}$. Now, observe that for a fixed distribution $P_{X^a,U}(x^a,u)$, both $I(U,X^a;Y)$ and 
$I(U;Y|X^a)$ are convex in $P_{Y|X^a,U}(y|x^a,u)$ and hence, convex in $P_{X^b|X^a,U,S}(\cdot|x^a,u,s)$. This and (127) 
imply that 

$$I(U;Y|X^a)_A \le \sum_{i=1}^{k}\lambda_i\, I(U;Y|X^a)_{A^{(i)}}, \tag{130}$$

$$I(U,X^a;Y)_A \le \sum_{i=1}^{k}\lambda_i\, I(U,X^a;Y)_{A^{(i)}}, \tag{131}$$

where $I(U;Y|X^a)_{A^{(i)}}$ and $I(U,X^a;Y)_{A^{(i)}}$ denote the mutual information terms induced by $A^{(i)}$. 
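The inequalities (130) and (131) are instances of the convexity of mutual information in the channel for a fixed input distribution. The sketch below illustrates this in a stripped-down setting with a single input $U$ and output $Y$, an assumption made to keep the example short; it is not the composite channel of the proof. The function names are likewise hypothetical.

```python
import numpy as np

def mi_input_channel(p_u, w):
    """I(U;Y) in bits for input pmf p_u[u] and channel matrix w[u, y]."""
    p_u = np.asarray(p_u, dtype=float)
    w = np.asarray(w, dtype=float)
    joint = p_u[:, None] * w
    p_y = joint.sum(axis=0, keepdims=True)
    prod = p_u[:, None] * p_y               # product of the marginals
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log2(joint[mask] / prod[mask])))

def convexity_gap(p_u, channels, lambdas):
    """sum_i lam_i I(U;Y)_{W_i} - I(U;Y)_{sum_i lam_i W_i}; nonnegative by convexity."""
    mixed = sum(l * np.asarray(w, dtype=float) for l, w in zip(lambdas, channels))
    average = sum(l * mi_input_channel(p_u, w) for l, w in zip(lambdas, channels))
    return average - mi_input_channel(p_u, mixed)

# Two deterministic (binary stochastic) channels and their equal mixture.
p_u = np.array([0.5, 0.5])
w1 = np.eye(2)                              # y = u
w2 = np.array([[0.0, 1.0], [1.0, 0.0]])     # y = 1 - u
print(mi_input_channel(p_u, w1), mi_input_channel(p_u, 0.5 * w1 + 0.5 * w2))
print(convexity_gap(p_u, [w1, w2], [0.5, 0.5]))
```

In this toy case each deterministic channel carries one bit while their mixture carries none, so the gap equals one bit. This shows why restricting attention to deterministic maps, as done through the binary stochastic matrices $A^{(i)}$, cannot reduce the mutual information terms on average.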
Now, let {Ri,Rl), 1 < i < k, be such that 

Rl<I(U;Y\X a ) A ^, 
Rt + Ri<I(U,X a ;Y) Ali) , 
and hence, CR*,i2j) G C| 5 , 1 < i < fc. Let = Eti HK^b)- Since 

a convex combination 

of achievable rates is also achievable, so (r[,r[) G (jf 5 . This observation and inequalities (128)-(131) 
complete the claim that (R a , Rj,) G C^ s . ■ 
Up to now, we have shown that Cp S C C^g and C^ s = Cas- m order to prove that = Cas> it 
remains to show that C^s C C^ 5 . Note that £/f 5 still depends on Px^,u(x a ,u) in which |W| can be 
larger than |T|. Hence, in the next lemma we basically show that for every Px a ,u(x a , u), there exists a 
ttt" ,u (t a , u) which induces the same rate constraints as induced by Px a ,u(x a , u). 
Lemma D.2: C% s C Cf s . 

Proof: Let us fix a joint distribution $P_{Y,X^a,X^b,U,S}(y,x^a,x^b,u,s)$ satisfying (125), i.e., 

$$P_{Y,X^a,X^b,U,S}(y,x^a,x^b,u,s) = P_{Y|X^a,X^b,S}(y|x^a,x^b,s)\,1_{\{x^b=m(s,x^a,u)\}}\,P_S(s)\,P_{X^a,U}(x^a,u). \tag{132}$$

Observe that for every $m$ satisfying $x^b = m(u,x^a,s)$, one can define 

$$x^b = m(u,x^a,s) = \tilde{m}(x^a,u)(s),\qquad \tilde{m}(x^a,u) \in \mathcal{T}, \tag{133}$$

where $\mathcal{T}$ is the set of all mappings from $\mathcal{S}$ to $\mathcal{X}_b$. Now, let 

$$\left(I(U;Y|X^a)_{P^{*}_{Y,X^a,U}(y,x^a,u)},\ I(U,X^a;Y)_{P^{*}_{Y,X^a,U}(y,x^a,u)}\right) \tag{134}$$

denote the mutual information pair induced by $P^{*}_{Y,X^a,U}(y,x^a,u)$. We have 
I(U,X a ;Y) P , xaui y tX a tU) 

\ — ■> v — v — "v ^Y U X a ^ ) 

= EEE E ^w,T(^M)iog p ™- (wa) 



terueuyeyx^ P Y (y)P^(u,x*) 

- 2^2^ 2^ Av*.,tf,T(v,* ,«,*)io g 

= EEE E ^.^T(y,x°,«,t)iog— — — — 
= EEE E t , r (y,x»,n,t)io g ^'^'f 



EEE ^<*^>^fe^ 

7(T,X a ; y)p, xaT( ^ it) , (135) 
where (i) is valid since $\tilde{m}(x^a,u) \in \mathcal{T}$, i.e., for each $(x^a,u)$ there exists only one $t \in \mathcal{T}$ such that 

$P_{T|X^a,U}(t|x^a,u) = 1$, (ii) is valid since 

( i } £p; |xa;W ( y |zv,^)p 5 ( S ) ( = } E^i^sO/I^m^oo 

se5 ses 

= J2 p Y,s\x»,Ay^K^) = P£\x;T(v\x a ,t), ( 136 ) 

seS 

where (iii) is valid since $S$ and $(X^a,T,U)$ are independent and (iv) is valid due to (3). Similarly, we 
have 

I(U;Y\X a )p* xau{y ^ a ^ u) 

D * i a \ i p Y,u\x«(yM xa ) 

= EE E *W(*,* ■") 1 °g p> (y|^ |Jc . (u|s«) 
ueUyey x a ex a Y\x a \»\ > u\x«\ i 

= EE E 

= Z^Z^Z^Z^ Py iX o iC/)T (y,x a ,n,t)log. 



= EEE E JW,t(V,* ,«,t)log ( |ga)f » fr^a) 

= EEE E ^..^(y, x-.ti,*) log ^ 4p^^y 



= EE E ^W^^pT , , . , . . 

= I(T;Y\X a ) P , xaT(y ^ t) , (137) 

where (v) and (vi) follows from the same reasonings of (i) and (ii), respectively. Now, let R b < 
I(U;Y\X a ) PPxau{yjX * jU) and R b + R' a < I{U,X a -Y) P . xau{y , x%u) . Hence, (R' a ,R b ) G C| 5 . Observe 
now that for a distribution in the form of Pyx a t(V' x<X -> *)> one can define rrx a ,T{x a , t) = P^ a T (x a , t). 
Therefore, since C PS = col (J- ^(ir) J , and due to (135) and (137), (R' a , R b ) G C PS , which completes 



the claim. 
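The reparametrization in (133), which replaces a map $m(u,x^a,s)$ by a strategy-valued map $\tilde{m}(x^a,u)$ taking values in the set of all functions from states to inputs, is an instance of currying. The sketch below enumerates that set for small alphabets and performs the conversion; the alphabets, the particular map `m` and the function names are hypothetical choices made for illustration only.

```python
from itertools import product

def all_strategies(states, inputs_b):
    """Enumerate every map t: S -> X_b as a tuple indexed by state position."""
    return list(product(inputs_b, repeat=len(states)))

def curry(m, states):
    """Turn m(u, xa, s) into m_tilde(xa, u), which returns a strategy t in T."""
    def m_tilde(xa, u):
        # The strategy lists the input chosen for each possible state.
        return tuple(m(u, xa, s) for s in states)
    return m_tilde

# Hypothetical toy alphabets and map.
states = [0, 1]
inputs_b = [0, 1, 2]

def m(u, xa, s):
    return (u + xa + s) % 3

strategies = all_strategies(states, inputs_b)
m_tilde = curry(m, states)
t = m_tilde(1, 2)                 # strategy selected by xa = 1, u = 2
print(len(strategies))            # |X_b| ** |S| strategies in total
print(t, t in strategies)
```

Because each pair $(x^a,u)$ selects exactly one strategy, the induced conditional law of $T$ given $(X^a,U)$ is a point mass, which is the property used in step (i) of the derivation of (135).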



References 

[1] G. Como and S. Yüksel, "On the capacity of memoryless finite state multiple access channels with asymmetric state information at the encoders," IEEE Trans. Inform. Theory, vol. 57, no. 3, pp. 1267-1273, March 2011. 
[2] C. E. Shannon, "Channels with side information at the transmitter," IBM J. Res. Develop., vol. 2, pp. 289-293, 1958. 






[3] S. I. Gelfand and M. S. Pinsker, "Coding for channels with random parameters," Prob. of Cont. and Inform., vol. 9, pp. 19-31, 1980. 
[4] M. Salehi, "Capacity and coding for memories with real-time noisy defect information at encoder and decoder," IEE Proceedings-I, vol. 139, pp. 113-117, Apr. 1992. 
[5] G. Caire and S. Shamai, "On the capacity of some channels with channel state information," IEEE Trans. Inform. Theory, vol. 45, no. 6, pp. 2007-2019, Sep. 1999. 
[6] U. Erez and R. Zamir, "Noise prediction for channels with side information at the transmitter," IEEE Trans. Inform. Theory, vol. 46, no. 4, pp. 1610-1617, July 2000. 
[7] A. Goldsmith and P. Varaiya, "Capacity of fading channels with channel side information," IEEE Trans. Inform. Theory, vol. 43, no. 6, pp. 1986-1992, Nov. 1997. 
[8] S. Yüksel and S. Tatikonda, "Capacity of Markov channels with partial state feedback," Proc. IEEE Int. Symp. Information Theory, Nice, France, pp. 1861-1865, June 2007. 
[9] S. Tatikonda and S. Mitter, "The capacity of channels with feedback," IEEE Trans. Inform. Theory, vol. 55, no. 1, pp. 323-349, Jan. 2009. 
[10] H. Permuter, T. Weissman and A. J. Goldsmith, "Finite state channels with time-invariant deterministic feedback," IEEE Trans. Inform. Theory, vol. 55, no. 2, pp. 644-662, Feb. 2009. 
[11] A. Das and P. Narayan, "Capacities of time-varying multiple access channels with side information," IEEE Trans. Inform. Theory, vol. 48, no. 1, pp. 4-25, Jan. 2002. 
[12] S. A. Jafar, "Capacity with causal and non-causal side information - a unified view," IEEE Trans. Inform. Theory, vol. 52, no. 12, pp. 5468-5474, Dec. 2006. 
[13] Y. Cemal and Y. Steinberg, "The multiple-access channel with partial state information at the encoders," IEEE Trans. Inform. Theory, vol. 51, no. 11, pp. 3992-4003, Nov. 2005. 
[14] A. Lapidoth and Y. Steinberg, "The multiple-access channel with two independent states each known causally to one encoder," Proc. IEEE Int. Symp. Information Theory, Austin, Texas, U.S.A., June 2010. 
[15] M. Li, O. Simeone, and A. Yener, "Multiple access channels with states causally known at transmitters," available at [arXiv:1011.6639]. 
[16] A. Lapidoth and Y. Steinberg, "A note on multiple-access channels with strictly-causal state information," available at [arXiv:1106.0380v1]. 
[17] U. Basher, A. Shirazi and H. Permuter, "Capacity region of finite state multiple-access channel with delayed state information at the transmitters," available at [arXiv:1101.2389]. 
[18] A. Somekh-Baruch, S. Shamai and S. Verdú, "Cooperative multiple-access encoding with states available at one transmitter," IEEE Trans. Inform. Theory, vol. 54, no. 10, pp. 4448-4469, Oct. 2008. 
[19] S. Kotagiri and J. Laneman, "Multiaccess channels with state known to one encoder: A case of degraded message sets," Proc. IEEE Int. Symp. Information Theory, Nice, France, pp. 1566-1570, June 2007. 
[20] A. Zaidi, P. Piantanida and S. Shamai, "Multiple access channel with states known noncausally at one encoder and only strictly causally at the other encoder," Proc. IEEE Int. Symp. Information Theory, St. Petersburg, Russia, July 2011. 
[21] H. Permuter, S. Shamai and A. Somekh-Baruch, "Message and state cooperation in multiple access channels," IEEE Trans. Inform. Theory, vol. 57, no. 10, pp. 6379-6396, Oct. 2011. 
[22] G. Keshet, Y. Steinberg, and N. Merhav, "Channel coding in the presence of side information: Subject review," Found. Trends Commun. Inf. Theory, vol. 4, no. 6, pp. 445-586, 2007. 





[23] H. S. Witsenhausen, "Equivalent stochastic control problems," Mathematics of Control, Signals and Systems, vol. 1, pp. 3-11, Springer-Verlag, 1988. 
[24] S. Yüksel, "On optimal causal coding of partially observed Markov sources in single and multi-terminal settings," available at [arXiv:1010.4824v2]. 
[25] A. Mahajan and D. Teneketzis, "Optimal performance of networked control systems with non-classical information structures," SIAM Jour. Cont. and Opt., vol. 48, no. 3, pp. 1377-1404, May 2009. 
[26] A. Nayyar and D. Teneketzis, "On the structure of real-time encoders and decoders in a multi-terminal communication system," IEEE Trans. Inform. Theory, vol. 57, no. 9, pp. 6196-6214, Sep. 2011. 
[27] T. M. Cover and J. Thomas, Elements of Information Theory, New York: Wiley, 2nd edition, 2006. 
[28] G. Högnäs, "Random semigroup acts on a finite set," J. Austral. Math. Soc., vol. 23 (Series A), pp. 481-498, 1977. 
[29] U. Niesen, C. Fragouli, D. Tuninetti, "On capacity of line networks," IEEE Trans. Inform. Theory, vol. 53, no. 11, pp. 4039-4058, Nov. 2007. 





